Text mining: Difference between revisions

From Citizendium
imported>Robert Badgett
(New page: '''Text mining''' "involves analysing a large collection of documents to discover previously unknown information".<ref name="titleText Mining briefing paper : JISC">{{cite web |url=http:...)
 
imported>Robert Badgett
No edit summary
Line 1: Line 1:
'''Text mining''' "involves analysing a large collection of documents to discover previously unknown information".<ref name="titleText Mining briefing paper : JISC">{{cite web |url=http://www.jisc.ac.uk/publications/publications/pub_textmining.aspx |title=Text Mining briefing paper : JISC |accessdate=2008-01-22 |author= |authorlink= |coauthors= |date= |format= |work= |publisher= |pages= |language= |archiveurl= |archivedate= |quote=}}</ref>


Coping with the many ways that a concept may be expressed in text (Zipf's law) remains a barrier to achieving results with text mining that are as good as human-curated results.<ref name="pmid15719064">{{cite journal |author=Rebholz-Schuhmann D, Kirsch H, Couto F |title=Facts from text--is text mining ready to deliver? |journal=PLoS Biol. |volume=3 |issue=2 |pages=e65 |year=2005 |pmid=15719064 |doi=10.1371/journal.pbio.0030065 |issn=}[http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=548955 PubMed Central]</ref>
Coping with the many ways that a concept may be expressed in text (Zipf's law) remains a barrier to achieving results with text mining that are as good as human-curated results.<ref name="pmid15719064">{{cite journal |author=Rebholz-Schuhmann D, Kirsch H, Couto F |title=Facts from text--is text mining ready to deliver? |journal=PLoS Biol. |volume=3 |issue=2 |pages=e65 |year=2005 |pmid=15719064 |doi=10.1371/journal.pbio.0030065 |issn=}}[http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=548955 PubMed Central]</ref>


==References==
Line 7: Line 7:


==External links==
National Institute         of Standards and Technology's (NIST) Information Technology Laboratory (ITL): [http://www.itl.nist.gov/iaui/894.02/related_projects/muc/ Introduction to Information Extraction]
* National Institute of Standards and Technology's (NIST) Information Technology Laboratory (ITL): [http://www.itl.nist.gov/iaui/894.02/related_projects/muc/ Introduction to Information Extraction]
[[Category:CZ Live]] [[Category:Library and Information Science Workgroup]]

Revision as of 14:07, 22 January 2008

Text mining "involves analysing a large collection of documents to discover previously unknown information".[1]

Coping with the many ways that a concept may be expressed in text (Zipf's law) remains a barrier to achieving results with text mining that are as good as human-curated results.[2]

References

  1. Text Mining briefing paper : JISC. Retrieved on 2008-01-22.
  2. Rebholz-Schuhmann D, Kirsch H, Couto F (2005). "Facts from text--is text mining ready to deliver?". PLoS Biol. 3 (2): e65. DOI:10.1371/journal.pbio.0030065. PMID 15719064. PubMed Central

External links