Skip to main content

Home/ interesting_sites/ Group items matching "text" in title, tags, annotations or url

Group items matching
in title, tags, annotations or url

Sort By: Relevance | Date Filter: All | Bookmarks | Topics Simple Middle
pagetribe .

Re: [Swftools-common] PDF2SWF and getTextSnapShot() - 0 views

  • that said, here there is my ActionScript code to highlight the text inside a PDF page. It works with Flash 8 or previous ONLY because the new Flash 9 has a different AVM interpreter (AVM2) and many things have changed. Please note: - ``txt`` is the text to search and highlight inside your page - ``mc`` points to _root.text that's where I was keeping my swf/pdf page you should change that so it references yours. Here is the code: function hltext ( txt ) { var mc = _root.text; var my_snap:textSnapshot = mc.gettextSnapshot(); var start_pos:Number = 0; start_pos = my_snap.findtext ( start_pos, txt, false ); while ( start_pos > 0 ) { trace ( start_pos ); my_snap.setSelected( start_pos, start_pos + txt.length, true ); start_pos += txt.length; start_pos = my_snap.findtext ( start_pos, txt, false ); } }
  •  
    that said, here there is my ActionScript code to highlight the text inside a PDF page. It works with Flash 8 or previous ONLY because the new Flash 9 has a different AVM interpreter (AVM2) and many things have changed. Please note: - ``txt`` is the text to search and highlight inside your page - ``mc`` points to _root.text that's where I was keeping my swf/pdf page you should change that so it references yours. Here is the code: function hltext ( txt ) { var mc = _root.text; var my_snap:TextSnapshot = mc.getTextSnapshot(); var start_pos:Number = 0; start_pos = my_snap.findText ( start_pos, txt, false ); while ( start_pos > 0 ) { trace ( start_pos ); my_snap.setSelected( start_pos, start_pos + txt.length, true ); start_pos += txt.length; start_pos = my_snap.findText ( start_pos, txt, false ); } }
pagetribe .

http://nltk.googlecode.com/svn/trunk/doc/book/ch01.html - 0 views

  • We can count how often a word occurs in a tex
  • Adding two lists creates a new list
  • count the occurrences of a particular word using text1.count('heaven')
  • ...18 more annotations...
  • By convention, m:n means elements m…n-1
  • A consequence of this last change is that the list only has four elements, and accessing a later value generates an error
  • We can join the words of a list to make a single string, or split a string into a list, as follows:
  • 'Monty Python'.split()
  • frequency distribution
  • frequency of each vocabulary item
  • find the 50 most frequent words
  • hese very long words are often hapaxes (i.e. unique) and perhaps it would be better to find frequently occurring long words.
  • Here are all words from the chat corpus that are longer than 7 characters, that occur more than 7 times:   >>> fdist5 = FreqDist(text5) >>> sorted([w for w in set(text5) if len(w) > 7 and fdist5[w] > 7]) ['#14-19teens', '#talkcity_adults', '((((((((((', '........', 'Question', 'actually', 'anything', 'computer', 'cute.-ass', 'everyone', 'football', 'innocent', 'listening', 'remember', 'seriously', 'something', 'together', 'tomorrow', 'watching'] >>>
  • The collocations() function does this for us
  • find bigrams that occur more often than we would expect based on the frequency of individual words.
  • fdist = FreqDist(samples) create a frequency distribution containing the given samples fdist.inc(sample) increment the count for this sample fdist['monstrous'] count of the number of times a given sample occurred fdist.freq('monstrous') frequency of a given sample fdist.N() total number of samples fdist.keys() the samples sorted in order of decreasing frequency for sample in fdist: iterate over the samples, in order of decreasing frequency fdist.max() sample with the greatest count fdist.tabulate() tabulate the frequency distribution fdist.plot() graphical plot of the frequency distribution fdist.plot(cumulative=True) cumulative plot of the frequency distribution fdist1 < fdist2 test if samples in fdist1 occur less frequently than in fdist2
  • it goes through each word in text1, assigning each one in turn to the variable w and performing the specified operation on the variable.
  • The above notation is called a "list comprehension"
  • [f(w) for ...] or [w.f() for ...],
  • Now that we are not double-counting words like This and this
  • by filtering out any non-alphabetic items:   >>> len(set([word.lower() for word in text1 if word.isalpha()]))
  • A collocation is a sequence of words which occur together unusually often. Thus red wine is a collocation, while the wine is not. A characteristic of collocations is that they are resistant to substitution with words that have similar senses — maroon wine sounds definitely odd.
pagetribe .

RE: [Swftools-common] PDF2SWF and getTextSnapShot() - 0 views

  • For everyone else make sure you follow these steps: 1. Use Flash 8 or previous version (I used 6) with the -T command : pdf2swf -T 6 2. Use the -f command for full fonts : pdf2swf -f 3. Test your outputted swf with: swfdump -t filename.swf , you should see a list of DEFINETEXT statements and the corresponding TEXT. Due to a font conflict I was seeing DEFINETEXT followed by jumbled up TEXT on my first pdf. 4. Test your outputted swf with: swfstrings filename.swf, you should see your TEXT and a LOT of ???????s. Again, I had garbage TEXT when trying to convert my original PDF. If the swfdump and swfstrings tests are working, load your pdf2swf.swf into Flash. Publish it for 8. I loaded it into a movieclip on my root timeline called 'loader' : loader.loadMovie("pdf2swf_files/6new.swf"); I have a movieclip called 'searchTEXT_mc' and have the following code for it: searchTEXT_mc.onRelease = function() { hlTEXT ("wonderful"); } And then the hlTEXT is as Fabio provided. This will yellow-highlight all the occurrences of the search string: function hlTEXT ( txt ) { trace("hlTEXT"); var mc = _root.loader; var my_snap:TEXTSnapshot = mc.getTEXTSnapshot(); var start_pos:Number = 0; start_pos = my_snap.findTEXT ( start_pos, txt, false ); trace("start_pos : " + start_pos); while ( start_pos > 0 ) { trace ( start_pos ); my_snap.setSelected( start_pos, start_pos + txt.length, true ); start_pos += txt.length; start_pos = my_snap.findTEXT ( start_pos, txt, false ); } } If anyone would like some sample files give me a shout,
pagetribe .

Upgrading our RSS feeds | Help | guardian.co.uk - 0 views

  •  
    Outlines some of the uses of the full text rss available across all of their content, site wide.
pagetribe .

Django 1.1 Talk Text - excess.org - 0 views

  •  
    Custom Upload Handling
pagetribe .

http://nltk.googlecode.com/svn/trunk/doc/book/ch02.html - 0 views

  • Loading your own Corpus If you have a collection of text files that you would like to access using the above methods, you can easily load them with the help of NLTK's PlaintextCorpusReader as follows:
pagetribe .

Revizr - Collaboration with Ownership - current - 0 views

  •  
    Online Document Collaboration
pagetribe .

Chapter 3: Views and URLconfs - 0 views

  • Now that we’ve designated a wildcard for the URL, we need a way of passing that wildcard data to the view function, so that we can use a single view function for any arbitrary hour offset. We do this by placing parentheses around the data in the URLpattern that we want to save.
  • we’re using parentheses to capture data from the matched text.
  •  
    *
1 - 12 of 12
Showing 20 items per page