Adding eyes to artificial intelligence is always a delicate thing. Do you want it to see everything you do all the time? Certainly not, but I think most of us agree that AI visual assistance when you need it could be very practical. Microsoft's new Copilot Vision may be one of the most promising applications of AI-based visual capabilities I've seen yet.
Microsoft unveiled the Copilot Vision update for its Windows app and its mobile apps (where you can point your camera at things and Vision can identify them for you) during a combined Copilot and Microsoft 50th anniversary event.
Copilot essentially got a brain transplant, using homegrown generative models (Microsoft AI, or MAI) and OpenAI GPT models to deliver updates across memory, search, personalization, and vision capabilities.
Now that I've seen Copilot Vision in action, I can tell you that it's one of the most exciting and important updates of the bunch, even though it's arriving in two stages.
In the version you can access now in the Windows desktop app, Copilot Vision can see the applications you're running on the desktop. When you open Copilot, by selecting the icon or pressing the Copilot key on your keyboard, you can now select the new glasses icon.
This brings up a list of open applications; in our case, we had two choices: Blender 3D and Clipchamp. That means that while Copilot is aware of the applications running on Windows, it doesn't automatically look at them.
We selected Blender 3D, and from that moment forward, something about my Windows experience changed. I realized that Copilot can actually see which application you're using, and instead of guessing your intent, it responds based on the application and even the project you're working on.
A 3D coffee table project was open, and using our voice, we asked how to make the table's design more traditional. Our prompt contained almost no details about the application or the project, but Copilot's response, delivered in a pleasant baritone, was entirely contextual.
We then switched gears and asked how to make annotations in the application. Copilot started to answer, but we interrupted and asked where to find the icon for adding annotations. Copilot quickly adjusted and explained how to find it.
This could be extremely useful, because you no longer have to break your flow to jump to search, or spend time explaining the application or project you're working in. Copilot Vision sees and knows.
Let me tell you, though, about what's coming next.
We followed the same steps to open Copilot and access the Vision component, but this time, we pointed Copilot at our open Clipchamp project.
We asked Copilot how to make our video transitions more seamless. Instead of a text response explaining what to do, Copilot Vision showed us exactly where to find the necessary tool in the application.
A giant arrow (inside an animated circle) appeared on screen, pointing at the transitions tool it recommended as it explained the necessary steps. We ran through this demo several times, and because it's still under development, it didn't always work.
When it did, though, it highlighted a potentially exciting change in how we work with applications in Windows.
We also saw a demonstration video showing Copilot Vision digging even deeper into Photoshop to find the right tools. This, my friends, is Clippy on steroids.
Imagine a future where you use text prompts or your voice to figure out how to perform tasks in an open application, and Copilot Vision digitally takes you by the hand and guides you. There's no sign that it will take in-app actions on your behalf, but it could be an incredible visual assistant.
The good news is that the Copilot Vision that at least knows which application and project you're working in is available now. The bad news is that the Copilot Vision I really want has no firm timeline. But I have to assume it won't be long. We saw it live, after all.