python+zope++:: An even smarter way to put the smarts in our apps
Posted at 07.May,2009 14:15

Further research while trying out reverend brought me to opencalais. OpenCalais (opencalais.org) is a service by Thomson Reuters that tries to categorize texts and extract meaning from them.

Here's what they say:

    The Calais Web Service automatically creates rich semantic metadata for the content you submit – in well under a second. Using natural language processing, machine learning and other methods, Calais analyzes your document and finds the entities within it. But, Calais goes well beyond classic entity identification and returns the facts and events hidden within your text as well.

Here's how we can access opencalais with python.

  • get python-calais
  • get an opencalais API key
  • prepare some texts to examine

 []$ ipython

 In [1]: import calais
 In [2]: c=calais.Calais('yourapikey')

 In [3]: txt="""Got myself a webcam, Logitech Go, or Logitech Express elsewhere.
   ...:
   ...: Initially, I hoped that everything work off the bat. But no dice. A bt oftwiddling was needed.
   ...:
   ...: lsusb shows:
   ...:
   ...:       Bus 001 Device 002: ID 046d:092f Logitech, Inc.
   ...:
   ...: I initially compiled and installed qc-usb. That was the wrong one. After a few google search, spca5x was the one I needed.
   ...:
   ...: a yum search spca5x revealed that I should use gspcav1. After installation, plugged in my webcam, and all is peachy.
   ...:
   ...: with kopete, I started a webcam session with someone on yahoo, and all works to perfection!
   ...:
   ...: ain't that cool or ain't that cool!"""

 In [4]: res=c.analyze(txt)
 In [5]: res.entities
 Out[5]:             
 [{'__reference': 'http://d.opencalais.com/genericHasher-1/635db363-c5af-38d1-8f03-b62868130722',                                                                    
  '_type': 'IndustryTerm',                                                        
  'instances': [{'detection': '[google search, spca5x was the one I needed.\n\na ]yum search spca5x[ revealed that I should use gspcav1. After]',                   
                 'exact': 'yum search spca5x',                                    
                 'length': 17,                                                    
                 'offset': 358,                                                   
                 'prefix': 'google search, spca5x was the one I needed.\n\na ',   
                 'suffix': ' revealed that I should use gspcav1. After'}],        
  'name': 'yum search spca5x',                                                    
  'relevance': 0.17699999999999999,                                               
  'resolutions': []},                                                             
 {'__reference': 'http://d.opencalais.com/comphash-1/f9228e55-d2a2-383a-bc24-f76863586c46',
  '_type': 'Company',                                                             
  'instances': [{'detection': '[Got myself a webcam, ]Logitech[ Go, or Logitech Express elsewhere.\n\nInitially, I]',                                               
                 'exact': 'Logitech',                                             
                 'length': 8,                                                     
                 'offset': 21,                                                    
                 'prefix': 'Got myself a webcam, ',                               
                 'suffix': ' Go, or Logitech Express elsewhere.\n\nInitially, I'},
                {'detection': '[myself a webcam, Logitech Go, or ]Logitech[ Express elsewhere.\n\nInitially, I hoped that]',                                        
                 'exact': 'Logitech',                                             
                 'length': 8,                                                     
                 'offset': 37,                                                    
                 'prefix': 'myself a webcam, Logitech Go, or ',                   
                 'suffix': ' Express elsewhere.\n\nInitially, I hoped that'},     
                {'detection': '[shows:\n    \n  Bus 001 Device 002: ID 046d:092f ]Logitech, Inc[.\n\nI initially compiled and installed qc-usb.]',                  
                 'exact': 'Logitech, Inc',                                        
                 'length': 13,                                                    
                 'offset': 216,                                                   
                 'prefix': 'shows:\n    \n  Bus 001 Device 002: ID 046d:092f ',   
                 'suffix': '.\n\nI initially compiled and installed qc-usb.'}],
  'name': 'Logitech Inc',
  'nationality': 'N/A',
  'relevance': 0.59999999999999998,
  'resolutions': [{'name': 'Logitech International S.A.',
                   'score': 1,
                   'ticker': 'LOGN'}]},
 {'__reference': 'http://d.opencalais.com/genericHasher-1/297cba0d-593e-3ae0-ad38-554d3114b0a1',
  '_type': 'IndustryTerm',
  'instances': [{'detection': '[qc-usb. That was the wrong one. After a few ]google search[, spca5x was the one I needed.\n\na yum search]',
                 'exact': 'google search',
                 'length': 13,
                 'offset': 311,
                 'prefix': 'qc-usb. That was the wrong one. After a few ',
                 'suffix': ', spca5x was the one I needed.\n\na yum search'}],
  'name': 'google search',
  'relevance': 0.26000000000000001,
  'resolutions': []}]
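
Since res.entities is a plain list of dicts, it is easy to post-process. Here is a minimal sketch of two things one might do with it: group the entity names by '_type', and use each instance's 'offset' and 'length' to slice the mention back out of the submitted text. The data is a hand-trimmed copy of the output above (only the first line of the text, and only the keys used here), and the helper names names_by_type and mentions are made up for illustration; they are not part of python-calais.

```python
from collections import defaultdict

# First line of the analyzed text, copied from the session above.
text = "Got myself a webcam, Logitech Go, or Logitech Express elsewhere."

# Hand-trimmed copy of two entities from the output above. Relevance values
# are rounded, and the 'google search' instance is dropped because its
# offset points past this shortened text.
entities = [
    {"_type": "IndustryTerm", "name": "google search", "relevance": 0.26,
     "instances": []},
    {"_type": "Company", "name": "Logitech Inc", "relevance": 0.6,
     "instances": [
         {"exact": "Logitech", "offset": 21, "length": 8},
         {"exact": "Logitech", "offset": 37, "length": 8},
     ]},
]


def names_by_type(entities):
    """Group entity names by '_type', most relevant entities first."""
    grouped = defaultdict(list)
    for ent in sorted(entities, key=lambda e: e["relevance"], reverse=True):
        grouped[ent["_type"]].append(ent["name"])
    return dict(grouped)


def mentions(text, entity):
    """Slice every recorded mention of an entity back out of the text."""
    return [text[inst["offset"]:inst["offset"] + inst["length"]]
            for inst in entity["instances"]]


print(names_by_type(entities))
print(mentions(text, entities[1]))
```

With a real result, the same helpers should work on res.entities and the original txt, assuming the offsets refer to the text exactly as it was submitted.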

Look at the other methods too, especially analyze_file() and analyze_url().

Next step? A Zope 2 product, hopefully.

