Wednesday, June 25, 2008

Evri Exposes the Web that Always Was

There are moments in my life that are incredibly vivid -- just thinking of them draws out a swirl of colors and sights and sounds in my mind. I recall being lost, at the age of 3, and escorted by a policeman to our home. I recall flying through the air for what felt like hours, until my shoulder smacked into a thick fir tree -- the sound of my bike crashing behind me -- and the pain and angst during the many hours it took to hike out. I remember huddling over a workstation in the basement of the University of Wisconsin engineering computer lab -- mouth wide open, staring at Mosaic, while watching a fellow graduate student click through the first few web sites I ever saw. I remember the first time my brilliant former boss showed me an early prototype of an NLP-based search tool about 7 years ago -- I saw for the first time an intriguing world of information, connecting things, or entities, in a direct and revealing way I had never seen before. Seeing a CNN video play in that first web browser, and wandering through the interconnectedness of the web, I knew the world as I had thought of it had changed. Seeing the web of entities for the first time, I knew the world as I had thought of it had changed again.

If you're like most people, when you think of the web, you think of a bunch of websites connected via hyperlinks to other websites. Indeed, we now simply take this web's incredible value for granted. But another web exists, and I believe its value is equally incredible though untapped -- it's a web that has existed since humans first wrote down a sentence -- it's the web of things, connected via language itself. It's this web that we humans use to understand and make sense of the world. So if the world wide web consisted of 1 website with 1 document containing 1 sentence -- the sentence being: "Evri launches its Beta," -- this web, the entity web, or as my CEO Neil calls it, the "data graph of the web," would consist of 2 nodes: Evri and Beta, with a clause-level grammatical linkage: launch. One of Noam Chomsky's great contributions of the non-political flavor was carving out a theory showing all humans communicate in languages representable by a common basic grammatical structure. At the most basic level, humans define a source of some action, the action itself, and then the object of that action. For example, a caveman points at himself, then points at his stomach, then points at some meat on a rock -- the message is clear: me want meat. Of course most of us use a more complex form of language that transcends caveman-speak.
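To make the one-sentence example above concrete, here's a minimal sketch in Python. It's purely illustrative and not a description of how Evri builds its graph: the (source, action, object) triple is written by hand, where a real system would have to extract it from text, and the name build_entity_graph is made up for this example.

```python
from collections import defaultdict

# Hand-written (source, action, object) triples. This is an assumption for
# illustration: a real system would extract them from sentences.
TRIPLES = [("Evri", "launch", "Beta")]


def build_entity_graph(triples):
    """Return (nodes, edges) for a list of (source, action, object) triples.

    nodes is a set of entity names; edges maps each source entity to a
    list of (action, object) pairs.
    """
    nodes = set()
    edges = defaultdict(list)
    for source, action, obj in triples:
        nodes.add(source)
        nodes.add(obj)
        edges[source].append((action, obj))
    return nodes, dict(edges)


nodes, edges = build_entity_graph(TRIPLES)
print(sorted(nodes))
print(edges)
```

Feeding in more triples grows the same structure: every new sentence adds nodes for its entities and a labeled link between them.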

Dealing with the complexities of language is one of the things that keeps many of us Evri-ites busy -- doing things like getting machines to learn how to resolve pronouns such as he/she/it and other anaphora like the lawyer, the president, or the company. A human can read a body of text about OJ Simpson and understand that "the lawyer" referred to throughout the article actually refers to Johnnie Cochran -- any system that tries to expose the entity web needs to do the same. Humans easily perform other tasks that are difficult for machines, but essential to the entity web -- for example, if you're talking about Will Smith, most humans can easily figure out whether you're referring to Will Smith the actor, entertainer, and musician or Will Smith the football player of the New Orleans Saints. Humans are pretty good at using the context of language to assist in this type of disambiguation task. We've made good strides at getting machines to perform this task pretty well by following the general approach humans follow, that is, by paying attention to the surrounding context and by leveraging our language model. For example, we know actors tend to perform actions like: act, perform in, star in, against objects like: movies, shows or events. We know football players perform very different actions like: scoring, tackling, or passing against objects like field goals, players or footballs. In addition to all the excitement with language, there's the great fun that has kept me busy for many a long day and late night -- whacking through the vast ocean of human communication quickly and efficiently -- as many companies have shown with the world wide web, it's not enough to know a web exists, it's essential to bring order and the ability to navigate efficiently.
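The Will Smith example above can be sketched as a toy scoring function. To be clear, this is an assumption-laden illustration and not Evri's language model: the PROFILES table just restates the actions and objects listed above, the disambiguate function is a made-up name, and the context terms are assumed to arrive already normalized (lowercase, singular, base verb form).

```python
# Toy profiles restating the actions and objects mentioned in the post.
# The table layout and labels are assumptions made for this sketch.
PROFILES = {
    "Will Smith (actor)": {
        "act", "perform", "star",          # typical actions
        "movie", "show", "event",          # typical objects
    },
    "Will Smith (football player)": {
        "score", "tackle", "pass",         # typical actions
        "field goal", "player", "football",  # typical objects
    },
}


def disambiguate(context_terms, profiles=PROFILES):
    """Pick the candidate whose profile overlaps most with the context.

    context_terms: normalized actions/objects found near the mention.
    Returns the best candidate name, or None if nothing overlaps.
    On a tie, the candidate listed first in profiles wins.
    """
    terms = {t.lower() for t in context_terms}
    scores = {name: len(terms & vocab) for name, vocab in profiles.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


print(disambiguate(["star", "movie"]))
print(disambiguate(["tackle", "player"]))
```

Counting overlaps is about the simplest thing that could work; it's only meant to show the idea of letting surrounding actions and objects vote on which entity a name refers to.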

So we're opening up some early glimpses into this web connecting all things; I'd love for you to tell us what you think. You can sign up at www.evri.com. And on a final note: I recently watched a great film called Heavy Metal In Baghdad that features the amazing story of Acrassicauda, a group of war-weary kids from Baghdad who simply want to rock out; here's a screenshot from their page on our new Beta:

3 comments:

Novario12 said...

Wow. What a poetic description of the semantic web. Can't wait to play with it! Cheers.

Naveen said...

Just checked out your work. Nice! Thanks for the invite. What I find fascinating is the actual connections. That's a lot of info I can navigate in a tiny area. Looking forward to when I can run into these pages on a Google search.

Anonymous said...

The beta's an interesting take on NLP search. Cool to see how different the app is from the work you describe in the Springer NLP / text mining chapter. Tough to find any entities though without a search box.
