
Add pass-through whitelisting functionality (#16)

* Added domain whitelisting to allow proxy to be bypassed for specific sites

* Update README.md to note whitelisting functionality

* Added some additional retro-friendly domains to whitelist.txt
ict 2 years ago
commit 6239389689
3 changed files with 20 additions and 0 deletions
  1. README.md: 1 addition, 0 deletions
  2. waybackproxy.py: 11 additions, 0 deletions
  3. whitelist.txt: 8 additions, 0 deletions

+ 1 - 0
README.md

@@ -15,6 +15,7 @@ WaybackProxy is a retro-friendly HTTP proxy which retrieves pages from the [Inte
 		* The easiest way to set up a transparent WaybackProxy is to run it on port 80 ([this cannot be done on Linux without security implications](https://unix.stackexchange.com/questions/87348/capabilities-for-a-script-on-linux)\), set up a fake DNS server - such as `dnsmasq -A "/#/ip"` where `ip` is the IP of the system running WaybackProxy - to redirect all requests to the proxy, and point client machines at that DNS server.
 4. Try it out! You can edit most settings that are in `config.json` by browsing to http://web.archive.org while on the proxy, although you must edit `config.json` to make them permanent.
 5. Press Ctrl+C to stop the proxy
+6. Exclude domains from being proxied by adding them to `whitelist.txt`
 
 ## Docker Container
 

+ 11 - 0
waybackproxy.py

@@ -28,6 +28,15 @@ class Handler(socketserver.BaseRequestHandler):
 		# Store a local pointer to SharedState.
 		self.shared_state = shared_state
 
+		# Read domain whitelist file.
+		try:
+			with open("whitelist.txt", "r") as whitelist_file:
+				# One hostname per line; skip blank lines and stray whitespace.
+				self.whitelist = [line.strip() for line in whitelist_file if line.strip()]
+		except OSError:
+			# Missing or unreadable file: no domains bypass the proxy.
+			self.whitelist = list()
+
 	def handle(self):
 		"""Handle a request."""
 
@@ -113,6 +122,8 @@ class Handler(socketserver.BaseRequestHandler):
 				pac += '''}\r\n'''
 				self.request.sendall(pac.encode('ascii', 'ignore'))
 				return
+			elif hostname in self.whitelist:
+				_print('[>]', archived_url,'(proxy bypassed by whitelist.txt)')
 			elif hostname == 'web.archive.org':
 				if path[:5] != '/web/':
 					# Launch settings if enabled.
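
The whitelist-loading step in this hunk can be sketched as a standalone function, which makes it easier to test outside the request handler. This is a minimal sketch, not the code from the commit: the name `load_whitelist`, the `path` parameter, and the choice of a `frozenset` return value are all assumptions made for illustration.

```python
import os
import tempfile


def load_whitelist(path="whitelist.txt"):
    """Return the hostnames listed in `path`, one per line, as a frozenset.

    Hypothetical helper: the commit itself reads the file inline in the handler.
    """
    try:
        with open(path, "r") as whitelist_file:
            # Strip surrounding whitespace and drop blank lines.
            return frozenset(line.strip() for line in whitelist_file if line.strip())
    except OSError:
        # A missing or unreadable file means nothing bypasses the proxy.
        return frozenset()


# Usage: write a small whitelist to a temporary file and load it back.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "whitelist.txt")
    with open(demo_path, "w") as demo_file:
        demo_file.write("frogfind.com\nwww.frogfind.com\n\n")
    demo_whitelist = load_whitelist(demo_path)
    missing_whitelist = load_whitelist(os.path.join(tmp_dir, "absent.txt"))
```

Returning a set keeps the later `hostname in whitelist` test cheap, and the trailing blank line in the demo file shows why empty entries are filtered out.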

+ 8 - 0
whitelist.txt

@@ -0,0 +1,8 @@
+frogfind.com
+www.frogfind.com
+www.floodgap.com
+gopher.floodgap.com
+68k.news
+www.68k.news
+yarchive.net
+www.yarchive.net
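
Because the handler tests `hostname in self.whitelist`, matching is exact, which is presumably why most sites appear with and without the `www.` prefix. The sketch below illustrates that behaviour with the eight entries above; the function name `bypasses_proxy` is hypothetical, and it assumes the hostname reaches the check unchanged.

```python
# The eight entries added to whitelist.txt by this commit.
WHITELIST = [
    "frogfind.com",
    "www.frogfind.com",
    "www.floodgap.com",
    "gopher.floodgap.com",
    "68k.news",
    "www.68k.news",
    "yarchive.net",
    "www.yarchive.net",
]


def bypasses_proxy(hostname, whitelist=WHITELIST):
    """Mirror the commit's exact-match test (hypothetical helper name)."""
    return hostname in whitelist


# Listed hostnames match; an unlisted variant of a listed site does not.
results = {
    host: bypasses_proxy(host)
    for host in ("frogfind.com", "www.floodgap.com", "floodgap.com")
}
```

Under this exact-match assumption, `floodgap.com` would still be routed through the Wayback Machine because only `www.floodgap.com` and `gopher.floodgap.com` are listed.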