* [werkzeug](https://github.com/pallets/werkzeug) - A WSGI utility library for Python that powers Flask and can easily be embedded into your own projects.

## Web Asset Management
@@ -1170,15 +1166,12 @@ Code Formatters

*Libraries for extracting web contents.*

-* [Haul](https://github.com/vinta/Haul) - An Extensible Image Crawler.
* [html2text](https://github.com/Alir3z4/html2text) - Convert HTML to Markdown-formatted text.
* [lassie](https://github.com/michaelhelmick/lassie) - Web Content Retrieval for Humans.
* [micawber](https://github.com/coleifer/micawber) - A small library for extracting rich content from URLs.
* [newspaper](https://github.com/codelucas/newspaper) - News extraction, article extraction and content curation in Python.
-* [python-goose](https://github.com/grangier/python-goose) - HTML Content/Article Extractor.
* [python-readability](https://github.com/buriy/python-readability) - Fast Python port of arc90's readability tool.
* [requests-html](https://github.com/kennethreitz/requests-html) - Pythonic HTML Parsing for Humans.
-* [sanitize](https://github.com/Alir3z4/python-sanitize) - Bringing sanity to world of messed-up data.
* [sumy](https://github.com/miso-belica/sumy) - A module for automatic summarization of text documents and HTML pages.
* [textract](https://github.com/deanmalmgren/textract) - Extract text from any document, Word, PowerPoint, PDFs, etc.
* [toapi](https://github.com/gaojiuli/toapi) - Every web site provides APIs.
@@ -1188,14 +1181,13 @@ Code Formatters
*Libraries to automate data extraction from websites.*

* [cola](https://github.com/chineking/cola) - A distributed crawling framework.
* [grab](https://github.com/lorien/grab) - Site scraping framework.
+* [MechanicalSoup](https://github.com/MechanicalSoup/MechanicalSoup) - A Python library for automating interaction with websites.
* [portia](https://github.com/scrapinghub/portia) - Visual scraping for Scrapy.
* [pyspider](https://github.com/binux/pyspider) - A powerful spider system.
-* [RoboBrowser](https://github.com/jmcarp/robobrowser) - A simple, Pythonic library for browsing the web without a standalone web browser.
-* [Scrapy](https://scrapy.org/) - A fast high-level screen scraping and web crawling framework.
+* [robobrowser](https://github.com/jmcarp/robobrowser) - A simple, Pythonic library for browsing the web without a standalone web browser.
+* [scrapy](https://scrapy.org/) - A fast high-level screen scraping and web crawling framework.

## Web Frameworks

@@ -1215,8 +1207,8 @@ Code Formatters

*Libraries for working with WebSocket.*

-* [AutobahnPython](https://github.com/crossbario/autobahn-python) - WebSocket & WAMP for Python on Twisted and [asyncio](https://docs.python.org/3/library/asyncio.html).
-* [Crossbar](https://github.com/crossbario/crossbar/) - Open-source Unified Application Router (Websocket & WAMP for Python on Autobahn).
+* [autobahn-python](https://github.com/crossbario/autobahn-python) - WebSocket & WAMP for Python on Twisted and [asyncio](https://docs.python.org/3/library/asyncio.html).
+* [crossbar](https://github.com/crossbario/crossbar/) - Open-source Unified Application Router (Websocket & WAMP for Python on Autobahn).
* [django-channels](https://github.com/django/channels) - Developer-friendly asynchrony for Django.
* [django-socketio](https://github.com/stephenmcd/django-socketio) - WebSockets for Django.
* [WebSocket-for-Python](https://github.com/Lawouach/WebSocket-for-Python) - WebSocket client and server library for Python 2 and 3 as well as PyPy.