Profile
No self-introduction
codes (2)
Semantic text extraction in web pages
no vote
Application background: We wrote this code as a minor project for the Semantic Web Technologies course at our college. It is a very basic attempt to remove advertisements from a web page and show only the relevant text: ads, Flash, JavaScript and similar content are stripped, and only the text is kept. The code is written in Python, whose many libraries reduce the coding effort required of the programmer.
Key Technology: The Web has become the largest information source, with billions of pages. However, a web page usually contains content that is irrelevant to its main topic, such as multimedia advertising segments, unnecessary images, or navigation links. These parts can seriously harm Web data mining, distract users from the main topic, and influence PageRank. There are some existing approaches to discover informative
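The stripping step described above might look roughly like the sketch below. It is not the uploaded project's code: it uses only Python's standard `html.parser` module, and the names `SKIPPED_TAGS`, `VisibleTextExtractor` and `extract_text` are hypothetical, as is the choice of which tags count as non-informative.

```python
from html.parser import HTMLParser

# Assumption for this sketch: content inside these tags is treated as
# non-informative (scripts, styles, embedded Flash objects, frames, navigation).
SKIPPED_TAGS = {"script", "style", "noscript", "object", "iframe", "nav"}


class VisibleTextExtractor(HTMLParser):
    """Collects text that is not nested inside any of the skipped tags."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0   # > 0 while inside at least one skipped tag
        self._chunks = []      # pieces of kept text, in document order

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text found outside skipped regions.
        if self._skip_depth == 0:
            text = data.strip()
            if text:
                self._chunks.append(text)

    def text(self):
        return "\n".join(self._chunks)


def extract_text(html):
    """Return the text of an HTML string with skipped regions removed."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

A tag blacklist like this only removes content that is marked up as script, style, or navigation. Advertisements placed in ordinary `div` elements would pass through, so a real extractor would need further heuristics.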
sushilkumarsah
2016-08-23
1
1
Semantic text extraction in web pages
no vote
sushilkumarsah
2016-08-23
0
1