OpenAI showed off its first AI agent, Operator, last week, but it already has a scrappy competitor offering an AI tool called Browser Use that can perform online tasks for you. This computer-using agent (CUA) can type, search, click buttons, and copy information from websites without you having to touch the mouse or keyboard, and without the $200-per-month ChatGPT Pro subscription.
Browser Use is actually free, at least if you are willing and able to spend time tinkering with the API code. I am not very code-literate, but I naively thought I knew enough about how GitHub works to use the API version. After hours of poring over documentation, adjusting settings, and looking up examples, I decided it would require a deeper level of coding knowledge than I have, not to mention the average person browsing the web.
Fortunately for me, Browser Use has just launched a cloud version that uses OpenAI’s own GPT-4o model. It removes much of the heavy technical lifting and streamlines things into a more familiar chat format without any additional work. It has its limits and costs $30, but after my inept API mess, it looked like a good deal. Even in this form (still obviously unfinished), you have to put effort into prompt engineering and work around how the AI operates. The most limiting aspect is that you can only submit one prompt before starting a new interaction. Despite the text box, you cannot respond to what the AI does and refine your request.
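Since each session accepts only one prompt, it helps to pack the goal, the sites, the fields you want back, and a fallback instruction into a single message. Here is a minimal sketch of that idea in plain Python. The function name `build_single_prompt` and its parameters are hypothetical helpers for illustration only; they are not part of Browser Use or any OpenAI API.

```python
def build_single_prompt(goal: str, sites: list[str], fields: list[str], fallback: str) -> str:
    """Combine everything the agent needs into one self-contained prompt.

    Hypothetical helper: the agent cannot be corrected mid-run, so the
    prompt states the goal, the sites, the output fields, and what to do
    if something goes wrong, all up front.
    """
    site_list = ", ".join(sites)
    field_list = ", ".join(fields)
    return (
        f"{goal} Use these sites: {site_list}. "
        f"For each result, report: {field_list}. "
        f"{fallback}"
    )


# Example values are made up for illustration.
prompt = build_single_prompt(
    goal="Search for 'MacBook Air M2'.",
    sites=["Amazon", "Best Buy", "Walmart"],
    fields=["price", "discounts"],
    fallback="If a site blocks access, say so and continue with the others.",
)
print(prompt)
```

The resulting string can be pasted into the cloud version's text box as the one and only message for that session.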
Buy AI
With everything set up, I put Browser Use through a few real-world tests. The first was a price comparison. I entered the prompt: “Go to Amazon, Best Buy, and Walmart and search for ‘MacBook Air M2.’ [...] If discounts are present, note them.”
It did the job well, although it found no discounts or hidden coupons. Still, the fact that I could automate price monitoring across several sites was quite exciting. That said, a recurring problem for any agent like this arises when a website wants to verify that you are human. Browser Use has a button that lets you take over whenever you wish, and it will also alert you when your input is needed. You can prove your humanity, then hit resume to let the AI take over again.
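To make the price-monitoring idea concrete, here is a rough sketch of what you might do with the numbers once the agent has reported them. It assumes you have already copied the prices into a dictionary by hand or by script; the function names and every price shown are made up for illustration and do not come from my test or from Browser Use itself.

```python
def cheapest(prices: dict[str, float]) -> tuple[str, float]:
    """Return the retailer with the lowest price and that price."""
    retailer = min(prices, key=prices.get)
    return retailer, prices[retailer]


def price_drops(previous: dict[str, float], current: dict[str, float]) -> dict[str, float]:
    """Return how much each retailer's price fell since the previous check.

    Retailers that are new, unchanged, or more expensive are left out.
    """
    drops = {}
    for retailer, new_price in current.items():
        old_price = previous.get(retailer)
        if old_price is not None and new_price < old_price:
            drops[retailer] = round(old_price - new_price, 2)
    return drops


# Hypothetical prices from two separate agent runs.
last_week = {"Amazon": 999.0, "Best Buy": 999.0, "Walmart": 979.0}
today = {"Amazon": 999.0, "Best Buy": 949.0, "Walmart": 979.0}

print(cheapest(today))
print(price_drops(last_week, today))
```

Running the same prompt on a schedule and feeding each result into a comparison like this is one way to turn a one-off lookup into a simple price tracker.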
Fly AI
Then came a travel planning task with the prompt: “Look for a round-trip flight from New York to London on December 15, 2025, on British Air. Select the cheapest option and extract the details, including the price, airline, and departure time.”
Browser Use delivered, pulling up a British Airways flight at $750, along with the departure time and other relevant details. This could be incredibly useful for people who book a lot of trips, especially if you automate it to check regularly for price drops.
Fair-weather friend
Finally, I tested weather forecasting and planning with the prompt: “Check the 7-day weather forecast for New York on Weather.com and summarize temperature trends, chances of rain, and severe weather warnings, then suggest how to dress for it.”
Weather is one of the most popular uses for voice assistants, so I wanted to see how the AI handled a more complex request in this vein. It was very successful, not only extracting the information from the forecast but also suggesting which days to wear a light coat and which days I should “bundle up with a coat and a warm scarf, because it will be cold with a low risk of rain.”
Hiking
The main difference between Browser Use and Operator is accessibility. Browser Use is like a Swiss Army knife for developers. It has the flexibility to do almost anything in a browser, but you need to know how to use the tools. You can dig into the code, modify it, and mold it to your exact needs. If a feature is missing, nothing prevents you from adding it. Being open source, Browser Use also has an active community of developers constantly refining it. This means that if you run into problems, there are GitHub forums and discussions where you can probably find answers.
OpenAI’s Operator, on the other hand, is like hiring a butler. It does a lot for you, but within certain constraints. Operator’s strength is its integration with OpenAI’s wider AI ecosystem, which gives it access to proprietary models that can make more nuanced decisions. However, you are locked into OpenAI’s pricing structure and limited customization options.
Browser Use is not perfect. Even its cloud version requires some patience. You must craft your prompts carefully and be prepared to troubleshoot and start over. The cloud version may make up for some of this later, but for now, not being able to modify or respond within the conversation puts hard limits on its otherwise flexible nature.
The speed can be frustrating, too. Check out a video of my second test; it is playing at four times the speed of the actual process.
Currently, Browser Use is best suited to people who like DIY, such as developers, researchers, and automation geeks who don't mind getting their hands dirty. If you are willing to put in the effort, you will get a powerful, flexible tool that costs much less than its competitors.
But if you would rather not spend your weekend wrestling with configuration files, Operator may be the more forgiving option. Either way, web automation is poised for a boom.