Inspiration

42.5M Americans live with a disability. At the same time, 97.4% of websites fail WCAG (Web Content Accessibility Guidelines). This has resulted in disabled Americans being 3x more likely not to use the Internet.

And navigating the web isn't always straightforward, especially if you're living with visual or physical impairments. Existing accessibility tools like JAWS or NVDA are useful, but not intelligent. It's difficult for users to translate their intents into actions.

Agentic flips this paradigm on its head, enabling users to go from intent directly to action using natural language. Beyond text-to-text, text-to-image, and text-to-video, we believe Agentic represents the next step forward in AI: text-to-action.

What it does

Using just natural language, users are able to use Agentic to navigate the web, perform multi-step operations, and interact with webpages. Through a complex pipeline built around HTML (more on that below), raw HTML is transformed into a format AI can more easily interact with, allowing Agentic to truly "see" and "act" with HTML as the source of truth.
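To give a feel for what "a format AI can more easily interact with" might look like, here is a minimal sketch, not Agentic's actual pipeline. It assumes the goal is to reduce a page to a short numbered list of interactive elements that a model can refer to by id. The `ActionableElement` shape and the function names are hypothetical, and the regex-based extraction is a deliberate simplification; a real pipeline would more likely walk the DOM through the browser driver.

```typescript
// Hypothetical shape for one actionable element; not taken from Agentic's code.
interface ActionableElement {
  id: number;    // index the model can refer to, e.g. "click 2"
  tag: string;   // "a", "button", "input", ...
  label: string; // visible text, or a fallback attribute
}

// Assumption: these tags are the ones worth exposing to the model.
const INTERACTIVE_TAGS = ["a", "button", "input", "select", "textarea"];

// Read a double-quoted attribute value out of a raw attribute string.
function getAttr(attrs: string, name: string): string | undefined {
  const match = new RegExp(`\\b${name}\\s*=\\s*"([^"]*)"`, "i").exec(attrs);
  return match ? match[1] : undefined;
}

// Scan raw HTML for interactive tags and keep only tag + a human-readable label.
// Simplification: inner text is captured only when it contains no nested tags.
function extractActionable(html: string): ActionableElement[] {
  const pattern = new RegExp(
    `<(${INTERACTIVE_TAGS.join("|")})\\b([^>]*)>(?:([^<]*)</\\1>)?`,
    "gi"
  );
  const elements: ActionableElement[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(html)) !== null) {
    const tag = match[1].toLowerCase();
    const attrs = match[2];
    const text = (match[3] ?? "").trim();
    const label =
      text ||
      getAttr(attrs, "aria-label") ||
      getAttr(attrs, "placeholder") ||
      getAttr(attrs, "name") ||
      "(unlabeled)";
    elements.push({ id: elements.length, tag, label });
  }
  return elements;
}

// Render the elements as compact lines suitable for inclusion in a prompt.
function toPromptLines(elements: ActionableElement[]): string {
  return elements.map((e) => `[${e.id}] <${e.tag}> ${e.label}`).join("\n");
}
```

Given a page containing a "View cart" link, a search box, and an icon-only close button, `toPromptLines(extractActionable(html))` would yield one short line per element, so the model can answer with something like "click 0" instead of reasoning over the full markup.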

How we built it

On the Frontend, Agentic is built with React, TypeScript, and Next.js. We leveraged OpenAI's Whisper with Transformers.js and ElevenLabs's API to create an immersive speech-to-text and text-to-speech interface.

On the Backend, we used Node.js, Selenium, and Google AI (Gemini Pro).

Challenges we ran into

Interacting with the large language model proved to be our largest challenge. Gemini would frequently hallucinate, respond to itself, and fail to follow instructions. We mitigated this with intensive prompt engineering and the introduction of system checks that verify proper operation and attempt to automatically catch and correct bad output. LLM interaction would have benefitted from more development time, but this system helps increase overall output quality.
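As an illustration of what such a system check could look like, here is a minimal sketch under stated assumptions; it is not the check Agentic actually runs. It assumes the model is asked to reply with a single JSON action, and that the page has been reduced to `elementCount` numbered elements. The `Action` schema, `CheckResult` type, and `checkModelReply` function are all hypothetical names introduced for this example.

```typescript
// Hypothetical action schema; the real project may use a different one.
type Action =
  | { kind: "click"; target: number }
  | { kind: "type"; target: number; text: string }
  | { kind: "navigate"; url: string };

type CheckResult =
  | { ok: true; action: Action }
  | { ok: false; reason: string };

// Validate a raw model reply before anything is executed in the browser.
function checkModelReply(reply: string, elementCount: number): CheckResult {
  // Models often wrap JSON in prose or code fences; keep only the outermost object.
  const start = reply.indexOf("{");
  const end = reply.lastIndexOf("}");
  if (start === -1 || end <= start) {
    return { ok: false, reason: "No JSON object found in reply." };
  }

  let obj: Record<string, unknown>;
  try {
    obj = JSON.parse(reply.slice(start, end + 1)) as Record<string, unknown>;
  } catch {
    return { ok: false, reason: "Reply contained malformed JSON." };
  }

  // A target is valid only if it names an element that exists on the page,
  // which guards against hallucinated element ids.
  const isValidTarget = (t: unknown): t is number =>
    typeof t === "number" && Number.isInteger(t) && t >= 0 && t < elementCount;

  const kind = obj["kind"];
  const target = obj["target"];
  const text = obj["text"];
  const url = obj["url"];

  if (kind === "click") {
    return isValidTarget(target)
      ? { ok: true, action: { kind: "click", target } }
      : { ok: false, reason: `Unknown element id: ${String(target)}` };
  }
  if (kind === "type") {
    if (!isValidTarget(target)) {
      return { ok: false, reason: `Unknown element id: ${String(target)}` };
    }
    return typeof text === "string"
      ? { ok: true, action: { kind: "type", target, text } }
      : { ok: false, reason: "A type action needs a string 'text' field." };
  }
  if (kind === "navigate") {
    return typeof url === "string" && /^https?:\/\//.test(url)
      ? { ok: true, action: { kind: "navigate", url } }
      : { ok: false, reason: "A navigate action needs an http(s) 'url' field." };
  }
  return { ok: false, reason: `Unsupported action kind: ${String(kind)}` };
}
```

When the check fails, the `reason` string can be appended to the next prompt so the model gets a chance to correct itself, with a cap on retries so a persistently bad reply does not loop forever.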

Additionally, this was, for a majority of our team, our first (‼️) hackathon, so navigating this new environment was an absolute thrill as well.

Also, it got cold at times, so sleeping was not always very fun.

Accomplishments that we're proud of

  • Finishing a lot of candy.
  • Learning to like Subway.
  • Building a minimum viable model that demonstrates the incredible capabilities of Large Action Models.
  • 1300+ mg of caffeine consumed in total.

What's next for Agentic

While we're very proud of our work on Agentic, we know there's much more technical work to be done in service of creating a general-purpose model that can take on every aspect of the web. But beyond advancements in the model, we plan to implement and maintain a much broader set of accessibility options (dyslexic font support, high contrast color options, easy font resizing, etc.) to further increase access.
