NLP and the Point of View
In Natural Language Processing (NLP), a Point of View (POV) is the perspective of a sequence of words: who said what to whom, and in what context. A simple example:
Bob said "I am Robert" => assign: said
content: I am Robert
In NLP this information is needed to:
- Assess the quality, validity and truth of the information in POV content
- Logically resolve references in POV content (I, you, we, etc.)
- Identify the source of, and access to, information presented in a POV, to determine which other entities can see that information and which are blind to it
- Determine whether quoted content is a POV rather than a media title or scare quote
There are several other common situations that alter the POV of content and affect the assessment of the information or who has access to it.
“don’t quote me on that”
The simplest POV is a quote with clear attribution, optionally to a person or entity.
Bob told me “I don’t want any words from you”
With the communication partners mapped, the references ‘I’ and ‘you’ in POV content can be logically resolved.
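A minimal Python sketch of this resolution step follows. The `PointOfView` record, the `PRONOUN_ROLES` table and the `resolve_references` function are hypothetical names introduced for illustration, and the addressee label "narrator" is an assumption standing in for the "me" of the example sentence. Real text would need a far larger pronoun table and proper tokenization.

```python
import re
from dataclasses import dataclass


@dataclass
class PointOfView:
    speaker: str    # who said it
    addressee: str  # to whom it was said
    content: str    # what was said


# Hypothetical, deliberately tiny table mapping pronouns to a POV role.
PRONOUN_ROLES = {
    "i": "speaker",
    "me": "speaker",
    "you": "addressee",
}


def resolve_references(pov):
    """Return (pronoun, entity) pairs for each pronoun found in the content."""
    resolved = []
    for token in re.findall(r"[A-Za-z]+", pov.content):
        role = PRONOUN_ROLES.get(token.lower())
        if role is not None:
            resolved.append((token, getattr(pov, role)))
    return resolved


# Bob told me "I don't want any words from you"
pov = PointOfView(
    speaker="Bob",
    addressee="narrator",  # assumed label for the "me" in the sentence
    content="I don’t want any words from you",
)
print(resolve_references(pov))
```

Here 'I' resolves to the speaker and 'you' to the addressee only because both roles were recorded on the POV first; without that mapping the pronouns have nothing to point at.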
Some say… hearsay
When something is only said to have been said, it may not be accurate and possibly adds no value… it is often hearsay. This content has a known speaker and a claimed source, but it must be tagged because it is suspect at best.
Bob said you're the one that ate my last cookie.
All the lies they tell
Information that is known to be a lie is not to be trusted. Unfortunately, identifying lies and mistruths is a bit difficult. A lie POV records the information and keeps it separated and tagged, so that it is known not to be good data and its source is remembered.
Bob lied to me “It wasn’t me that ate the last cookie!”
Everyone’s a comedian!
Some content is just funny… but generally the text of a joke is not to be considered good information. So, like lies, humor is identified when possible to keep our information trusted.
“why did the chicken cross the road?”
Fortunately questions are easy to identify, but who is asking them, and of whom, needs to be known as well. The information in a question is not known to be good until we have an answer… an honest answer.
Bob, did you eat my last cookie?
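The situations described so far (quote, hearsay, lie, humor, question) can be recorded as a tag on each POV, so that suspect content stays separated from trusted data. The sketch below is one possible shape, not an established scheme: `PovKind`, `TaggedPov`, `SUSPECT_KINDS`, `looks_like_question` and `tag` are hypothetical names, and the question check is only a crude surface test on the trailing question mark. Detecting lies or humor is assumed to happen elsewhere and is passed in as a default.

```python
from dataclasses import dataclass
from enum import Enum


class PovKind(Enum):
    QUOTE = "quote"
    HEARSAY = "hearsay"
    LIE = "lie"
    HUMOR = "humor"
    QUESTION = "question"


# Assumed grouping: kinds whose content is suspect or not yet confirmed.
SUSPECT_KINDS = {PovKind.HEARSAY, PovKind.LIE, PovKind.HUMOR, PovKind.QUESTION}


@dataclass
class TaggedPov:
    source: str    # who the content is attributed to
    content: str   # the text itself
    kind: PovKind  # how the content should be treated

    @property
    def suspect(self):
        return self.kind in SUSPECT_KINDS


def looks_like_question(text):
    # Surface check only: ends with '?' once spaces and closing quotes are trimmed.
    return text.rstrip(" \"'”’").endswith("?")


def tag(source, content, default=PovKind.QUOTE):
    """Tag content as a question if it looks like one, otherwise use the default kind."""
    kind = PovKind.QUESTION if looks_like_question(content) else default
    return TaggedPov(source, content, kind)


question = tag("narrator", "Bob, did you eat my last cookie?")
lie = tag("Bob", "It wasn’t me that ate the last cookie!", default=PovKind.LIE)
quote = tag("Bob", "I am Robert")
print(question.kind, lie.kind, quote.suspect)
```

Keeping the source on every tagged record means a lie or a joke is still remembered along with who said it, rather than simply discarded.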
Headers, footers, adverts, and more
All content has a POV, often simply the author communicating to the reader, or author to recipient as in an email, letter or PM.
But most structured documents contain sections not written by the author, such as a header, footer, preface or even adverts. This content is often attributable only to a generic publisher, but it must be properly mapped to a POV so it is clear that whatever truth or quality we attribute to the author is not associated with this information.
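As a rough illustration, the mapping of document sections to a source could look like the Python sketch below. It assumes the document has already been split into labeled sections; the section labels, the `map_section_sources` function, the generic "publisher" source and the sample document are all hypothetical.

```python
# Section types the text lists as not written by the author.
NON_AUTHOR_SECTIONS = {"header", "footer", "preface", "advert"}


def map_section_sources(sections, author, publisher="publisher"):
    """Attach a POV source to each (section_type, text) pair."""
    mapped = []
    for section_type, text in sections:
        source = publisher if section_type in NON_AUTHOR_SECTIONS else author
        mapped.append({"section": section_type, "source": source, "text": text})
    return mapped


# Made-up sample document, already split into labeled sections.
document = [
    ("header", "Example Site"),
    ("body", "Bob told me he was busy."),
    ("advert", "Buy more cookies!"),
]
mapped = map_section_sources(document, author="Alice")
for entry in mapped:
    print(entry["section"], "->", entry["source"])
```

With each section carrying its own source, any trust score attached to the author applies only to the sections the author actually wrote.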
Without mapping the POV for all the tokens, logical reference resolution and information integrity are not possible for NLP. Mapping also adds a lot of easy-to-use information for data mining and content analysis. Clearly a lot of work remains in the field before lies and humor can be accurately mapped and understood, but it is just a matter of time.
What can you do with access to this information?