Home/ VirgoLab/ Group items tagged papers

Roger Chen

PageRank in academic publishing « Peter Rohde's Blog - 0 views

  • The standard measure scientists use to judge the importance of scientific papers is a simple citation count. That is, how many other papers cite the paper in question? While this measure has its merits, it has one fundamental flaw - not all citations are equal.
  • Numerous authors/bloggers have advocated using a PageRank-like index for quantifying the importance of papers or journals.
  • To represent the web we use a directed graph, where the edges carry a direction.
  • The goal of the PageRank algorithm is two-fold. We wish to construct a measure of relevance that reflects, first, how many incoming links a site has and, second, how important the sources of those links are.
  • Well, scientific papers can be mapped to a graph in a similar way to web-sites. Specifically, vertices in the graph would represent papers, and edges would represent citations. The PageRank algorithm can be applied out-of-the-box.
  • First of all, one could discount self-citations from the index.
  • A second variation that one might try is to add a time bias when calculating the index, such that links from more recent papers carry more weight than from older papers.
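The mapping described in the annotations above (papers as vertices, citations as directed edges) can be illustrated with a minimal PageRank sketch. This is not code from the linked blog post: the function name `pagerank`, the damping factor of 0.85, the fixed iteration count, and the toy citation graph are all assumptions chosen for illustration. Papers that cite nothing are treated as spreading their rank evenly over all papers, which is one common convention. The self-citation and time-bias variations are not implemented here.

```python
def pagerank(citations, damping=0.85, iterations=50):
    """Power-iteration PageRank over a citation graph.

    `citations` maps each paper to the list of papers it cites.
    The damping value and iteration count are illustrative assumptions.
    """
    # Collect every paper that appears as a citer or as a cited work.
    papers = set(citations)
    for cited in citations.values():
        papers.update(cited)
    n = len(papers)

    # Start from a uniform distribution over all papers.
    rank = {p: 1.0 / n for p in papers}

    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in papers}
        for paper in papers:
            cited = citations.get(paper, [])
            if cited:
                # A paper passes its rank on in equal shares to what it cites.
                share = damping * rank[paper] / len(cited)
                for target in cited:
                    new_rank[target] += share
            else:
                # A paper that cites nothing spreads its rank over all papers.
                share = damping * rank[paper] / n
                for target in papers:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical toy graph: X and Y each receive exactly one citation,
# but X is cited by H, which is itself cited three times.
citations = {
    "P1": ["H"],
    "P2": ["H"],
    "P3": ["H"],
    "H": ["X"],
    "Q": ["Y"],
}
rank = pagerank(citations)
print(rank["X"] > rank["Y"])  # prints True
```

In this toy graph a plain citation count cannot separate X from Y, while the PageRank-style score ranks X higher because its single citation comes from a more heavily cited paper.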
Roger Chen

Write good papers - 0 views

  • Be ambitious.
  • Most papers should make a single point.
  • How is your contribution different from what has been said a thousand times before?
  • A sexy start: tell the reader early why he should read your paper. Don’t summarize, sell! A good abstract answers the question "Why should I read this paper?"
  • 5. What a good paper should not contain. Weak, unnecessary results: if you derived ten theorems but only one is necessary, throw the rest of them in your drawers. I do not want to know about useless results! Technical details: technical papers made of several small ideas are usually not interesting.
Roger Chen

Paper: MapReduce: Simplified Data Processing on Large Clusters | High Scalability - 0 views

  • Some interesting stats from the paper: Google executes 100k MapReduce jobs each day; more than 20 petabytes of data are processed per day; more than 10k MapReduce programs have been implemented; machines are dual processor with gigabit ethernet and 4-8 GB of memory.
Roger Chen

How to Maximize Citations « Apperceptual - 0 views

  • Why should we want our papers to be highly cited? I assume here that we want our work to influence other researchers, and that citation count is a reasonable estimate of influence.
Roger Chen

Google Experiments With Next Generation Image Search - 0 views

  •  
    Two Google scientists presented a paper at WWW 2008 held in Beijing last week that outlines their vision for the future of image search.
Roger Chen

Geeking with Greg: Finding the location of interests and objects from search logs - 0 views

  •  
    A paper at WWW 2008, "Spatial Variation in Search Engine Queries" (PDF), by Lars Backstrom, Jon Kleinberg, Ravi Kumar, and Jasmine Novak offered many clever examples of using where people are when they do a web search both to determine when interest in a topic is geographically isolated and to estimate the physical location of objects.
Roger Chen

Word Count as a Measure of Quality on Wikipedia-Beyond Search - 0 views

  •  
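    The recently concluded WWW2008 conference included a short paper, "Size Matters: Word Count as a Measure of Quality on Wikipedia". It reports a surprising experimental result: when assessing the quality of Wikipedia articles, using "Word Count" as the only feature is enough to reach 96.31% accuracy! That result is better than many algorithms that use complex models!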
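A single-feature classifier of the kind described above can be sketched in a few lines. This is only an illustration, not the paper's code: the function names are hypothetical, whitespace splitting is an assumed way of counting words, and the threshold of 2000 words is a placeholder that would have to be tuned on labelled articles.

```python
def word_count(text):
    """Count whitespace-separated tokens (an assumed definition of 'word')."""
    return len(text.split())


def looks_high_quality(text, threshold=2000):
    """Label an article as high quality if it is at least `threshold` words long.

    The default threshold is a hypothetical placeholder, not a value
    taken from the annotation above.
    """
    return word_count(text) >= threshold


# Toy usage with synthetic text rather than real Wikipedia articles.
long_article = "word " * 2500
short_stub = "A very short stub article."
print(looks_high_quality(long_article), looks_high_quality(short_stub))
```

The appeal of such a rule is that it needs no model training beyond choosing one cutoff, which is what makes the reported accuracy surprising.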
Roger Chen

Geeking with Greg: Yahoo, Hadoop, and Pig Latin - 0 views

  •  
    Chris Olston, Benjamin Reed, Utkarsh Srivastava, Ravi Kumar, and Andrew Tomkins from Yahoo have an upcoming paper at SIGMOD 2008, "Pig Latin: A Not-So-Foreign Language for Data Processing" (PDF), that details Yahoo's work to build a powerful parallel processing language on top of Hadoop.
Roger Chen

Key difference between Web 1.0 and Web 2.0 - 0 views

  •  
    Web 2.0 is a buzzword introduced in 2003-04 which is commonly used to encompass various novel phenomena on the World Wide Web. Although largely a marketing term, some of the key attributes associated with Web 2.0 include the growth of social networks, bi-directional communication, various 'glue' technologies, and significant diversity in content types. We are not aware of a technical comparison between Web 1.0 and 2.0. While most of Web 2.0 runs on the same substrate as 1.0, there are some key differences. We capture those differences and their implications for technical work in this paper. Our goal is to identify the primary differences leading to the properties of interest in 2.0 to be characterized. We identify novel challenges due to the different structures of Web 2.0 sites, richer methods of user interaction, new technologies, and fundamentally different philosophy. Although a significant amount of past work can be reapplied, some critical thinking is needed for the networking community to analyze the challenges of this new and rapidly evolving environment.
Roger Chen

How to Write a Scientific Paper - 0 views

  •  
    Annals of Improbable Research, Vol. 2, No. 5, pg. 8.
Roger Chen

Microsoft on Organizing Information in Storylines -SEO by the SEA - 0 views

  •  
    A newly published patent application from Microsoft takes an interesting spin on presenting information, pulling together news from a mix of sources to present topics in storylines, and providing ways to have that information delivered to us over computers, smart phones, watch interfaces, and in other ways.
Roger Chen

The Noisy Channel: Special Issues of Information Processing & Management - 0 views

  •  
    Max Wilson at the University of Southampton recently called my attention to a pair of special issues of Information Processing & Management. The first is on Evaluation of Interactive Information Retrieval Systems; the second is on Evaluating Exploratory Search Systems. Both are available online at ScienceDirect.
Roger Chen

Scimago Journal & Country Rank - 0 views

  •  
    The SCImago Journal & Country Rank is a portal that includes the journals and country scientific indicators developed from the information contained in the Scopus® database (Elsevier B.V.). These indicators could be used to assess and analyze scientific domains.