All about Tor (and running a tor hidden service)

I have recently been thinking about the tor network again. This time the appeal is actually the increased (or perceived increased) barrier to entry. The purpose of increasing the difficulty of access is twofold: it greatly reduces the number of visitors while (hopefully) increasing the quality of the people visiting. I quite like the idea of "you can only talk to my server using the protocols I want you to use". Gopher, finger, bbs, and shell servers are examples of restricting users by evoking the arcane horror that is the scary world of text-only computing. Tor and i2p are examples of restricting users by causing a "fear from ignorance" response in the types of people who think that visiting 4chan will get them a visit from the FBI and people who believe that the YouTube trend of "dark web mystery boxes" is anything other than a series of staged and overacted videos that only exist to make a quick cash grab off the woooooo spooky internet!! trend.

The thing that interests me the most is applying the definitions of dark web and deep web to ordinary services many of us UNIX users use every day. The surface web is any indexed website that talks http. The deep web is any service that is unindexed and/or password protected. This means that any system running sshd that is not also recursively exposing the root directory via an http server is technically running a deep web service. The dark web is any server that is not indexed and requires some form of overlay network to access. If you've ever nested ssh sessions to access a machine behind your router you are, in fact, running a scary and illegal dark web service.

Hyperbole and dead-horse-beating aside, it is not illegal to use overlay networks and host hidden services for legal purposes. The protocol does not determine legality; the content hosted on the server and the activity on it do. Keep in mind that you have been added to an NSA watchlist by virtue of clicking on this post (or really anything on my website) for the same reasons that Linux Journal readers were flagged by the NSA for being "extremists" who "warranted extra surveillance". You will also be flagged for visiting torproject.org, for using the tor network, and for researching open source software due to skepticism of companies involved in the PRISM program. Welcome to the watchlist.

Tor for end users

Tor is an internet overlay network that can be conceptualized as a nested series of randomly selected proxies. This nested series is called a "tor circuit" where each hop in the chain is called a "relay". The first relay is either an "entry guard", which is a publicly listed node, or a "bridge", which is a node that is not publicly listed. At the end of the circuit there is either an exit node or a hidden service. Hidden services are servers that are listening on the tor network. An exit node is just a gateway proxy for the regular internet. Between the first and last hop there are one or more middle relays; a default circuit to the regular internet has three relays in total (guard, middle, exit). The client wraps outgoing traffic in one layer of encryption per relay, and each relay removes its own layer as the traffic passes through.

The privacy strategy used by tor is to make every user look the same. There are many things that can break this model but it generally boils down to avoiding weak links in the tor network, selecting appropriate freedom respecting software, and following good operational security. None of these practices are tor specific; they can have profound privacy implications when applied to normal (i.e. no tor) internet usage as well.

Tor is not perfect

Tor is loud. Everyone knows you're using tor because the list of entry nodes is public. Using a bridge can mitigate this risk but can incur other risks. The process of getting a bridge without an existing tor connection can cause problems.

In addition to issues when entering the network there are significantly larger issues when exiting the network. In 2020 it was estimated that ~25% of tor's exit capacity was controlled by a single malicious actor. This actor was capturing unencrypted traffic in an attempt to steal credentials and cryptocurrency.

Correlation attacks are another issue with exit nodes. They are only possible with a level of monitoring achievable by government agencies. The attack works by capturing all traffic into and out of the tor network. Packet content and timing (as well as things like unencrypted email headers) can all be used to correlate traffic and deanonymize users. Staying on .onion addresses will prevent any exit node related issues.

The tor project also provides some general tips for staying anonymous.

Software for users

I shouldn't have to say it but intuition is a rare commodity in the modern era. If you think that running tor atop proprietary software will keep you safe you have already lost. The only way to make a system running proprietary software private and secure is to open the apple system information menu/windows device manager, note every hardware device, then physically remove each of these devices from the motherboard. Most if not all vendors of proprietary software are involved with various government projects (such as PRISM) with a thinly veiled goal of backdooring end users in order to simplify mass surveillance.

You should be using torbrowser on a free UNIX-like operating system such as Linux or BSD. Linux is preferable because tiny nuances in the networking stack of the various UNIXes can be used to deanonymize users, especially if the user is the only person accessing the tor network using something obscure like illumos from a residential IP. Virtualizing Linux is not an acceptable solution. A distribution like Tails can be a good choice due to its default behavior of pushing all traffic over tor and not creating any persistent files. Torbrowser can be installed on most other Linux distributions by installing either of the torbrowser or torbrowser-launcher packages.

OPSEC

Operational security is a large topic but I generally describe it as the following:

1. What is my threat model?
2. What are the critical pieces of information that I don't want the adversary to know?
3. Where am I vulnerable?
4. How do I fix the vulnerabilities?
5. If I was the adversary how would I attack myself? GOTO 1;

My personal threat model is resisting data hungry corporations and avoiding the types of people who are addicted to data hungry corporations. In the process I reduce my overall "data leakage" and remove a lot of the drive-by attacks. One way of implementing practices to mitigate these threats is to practice pseudonymity. Reserve legal identities for real life legal contexts and use various disconnected personas in internet contexts. It can be useful to keep track of which personas you link to other personas so that you don't end up dragging garbage into identities. On the internet, nobody knows you're a dog.

Other OPSEC issues are fairly intuitive as they include metadata, not posting your face online, not posting your address online, not posting your HAM callsign online, etc. Advanced techniques for deanonymizing users can include writing analysis. You can mitigate this risk by practicing simple English and brevity.

Tor for webmasters

For the remainder of this post I will be documenting the process of mirroring an existing website to a tor hidden service. Of course, this will be OpenBSD specific. I assume the reader (target is future me) has a functioning OpenBSD httpd configuration.

Installing and configuring tor

# pkg_add tor torsocks
# rcctl enable tor
# rcctl start tor

Edit /etc/tor/torrc:

#HiddenServiceDir /var/tor/my-hidden-service/
#HiddenServicePort externalPort 127.0.0.1:internalPort
HiddenServiceDir /var/tor/my-hidden-service/
HiddenServicePort 8080 127.0.0.1:8081
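
To double-check which onion-side port maps to which local target, the active lines can be pulled back out of torrc. This is only a minimal sketch and not part of the setup itself: list_onion_ports is a hypothetical helper name, and it assumes the simple one-directive-per-line layout shown above.

```shell
#!/bin/sh
# list_onion_ports FILE
# Print "service-dir onion-port -> local-target" for every active
# (uncommented) HiddenServicePort line, paired with the most recent
# HiddenServiceDir seen before it.
# Hypothetical helper: assumes one directive per line, as in the torrc above.
list_onion_ports() {
	awk '
		/^[ \t]*#/ { next }                           # skip commented-out lines
		$1 == "HiddenServiceDir"  { dir = $2 }        # remember the current service
		$1 == "HiddenServicePort" { printf "%s %s -> %s\n", dir, $2, $3 }
	' "$1"
}
```

Called as list_onion_ports /etc/tor/torrc, this should print a single line for the configuration above, mapping 8080 to 127.0.0.1:8081.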

You will also need to modify /etc/pf.conf accordingly:

set skip on lo                                          
tcp_services="{ssh, http, https, 8080}"                 
block in all                                            
pass in proto tcp to any port $tcp_services keep state  
pass out all

And reload pf's ruleset:

# pfctl -e -f /etc/pf.conf
# pfctl -sr
block drop in all                                       
pass in proto tcp from any to any port = 22 flags S/SA  
pass in proto tcp from any to any port = 80 flags S/SA  
pass in proto tcp from any to any port = 443 flags S/SA 
pass in proto tcp from any to any port = 8080 flags S/SA
pass out all flags S/SA

Restart tor and get the generated .onion URL. The actual .onion URL will be much longer than the example.

# rcctl restart tor && rcctl check tor
# cat /var/tor/my-hidden-service/hostname
myURL.onion
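
For reference, a current (version 3) onion address is 56 base32 characters (a-z and 2-7) followed by .onion. A small sanity check on the contents of the hostname file could look like the sketch below; is_v3_onion is a hypothetical helper name, not something tor ships.

```shell
#!/bin/sh
# is_v3_onion HOSTNAME
# Succeed (exit 0) if HOSTNAME has the shape of a version 3 onion address:
# exactly 56 characters from the base32 alphabet (a-z, 2-7) plus ".onion".
# This checks the shape only; it does not verify the checksum embedded in
# the address or whether the service is reachable.
is_v3_onion() {
	printf '%s\n' "$1" | grep -Eq '^[a-z2-7]{56}\.onion$'
}
```

For example: is_v3_onion "$(cat /var/tor/my-hidden-service/hostname)" && echo "looks like a v3 address"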

Modify /etc/httpd.conf:

server "myURL.onion" {
	listen on 127.0.0.1 port 8081
	root "/myURL.onion"
}

We will also need to restart httpd and create some files:

# rcctl restart httpd && rcctl check httpd
# mkdir /var/www/myURL.onion
# echo "it works" > /var/www/myURL.onion/index.html
# torsocks curl http://myURL.onion:8080/
it works
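
A freshly started hidden service is not always reachable the instant tor comes back up, so the first request can fail even when the configuration is fine. A minimal retry wrapper, assuming nothing beyond POSIX sh (the name retry and its argument order are my own choices for illustration):

```shell
#!/bin/sh
# retry ATTEMPTS DELAY COMMAND [ARGS...]
# Run COMMAND until it succeeds, at most ATTEMPTS times, sleeping DELAY
# seconds between failed attempts. Returns 0 on the first success and 1 if
# every attempt failed.
retry() {
	attempts=$1
	delay=$2
	shift 2
	n=1
	while [ "$n" -le "$attempts" ]; do
		"$@" && return 0
		# No point sleeping after the final failed attempt.
		[ "$n" -lt "$attempts" ] && sleep "$delay"
		n=$((n + 1))
	done
	return 1
}
```

For example: retry 10 30 torsocks curl -sf http://myURL.onion:8080/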

In actual production I find that it's easier to just copy my existing vhost configuration into a second vhost that listens on loopback:8081 than to proxy tor traffic to my ssl vhost. I'm sure there are fancier ways of doing this with relayd but I had some issues trying to get this working with my existing httpd configuration. The existing configuration redirects all requests on port 80 to port 443 and I'm not sure if there's a way to do this elegantly with tor traffic using relayd. Tor traffic is already encrypted so it's a non-issue either way.

For readers attempting to run their own hidden service: proceed with caution. In the process of researching for this article I found many examples of how a bad webserver configuration can expose information about your server and reveal its IP address, webserver, etc. Additionally, the configuration in this article uses ports rather than UNIX sockets for the HiddenServicePort line. You should use UNIX sockets instead of ports if you don't want your tor service exposed to your local network.
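
One rough way to look for the kind of leak described above is to fetch the response headers through tor and flag anything that names the server software or contains an IPv4 address. The sketch below is only a starting point under those assumptions: scan_headers is a hypothetical helper, and the header names it looks for are common examples rather than an exhaustive list.

```shell
#!/bin/sh
# scan_headers
# Read HTTP response headers on stdin and print the lines that may identify
# the host: Server / X-Powered-By / Via headers, or any line containing
# something shaped like an IPv4 address.
# Exit status follows grep: 0 if at least one suspicious line was printed,
# 1 if nothing matched.
scan_headers() {
	grep -Ei '^(server|x-powered-by|via):|([0-9]{1,3}\.){3}[0-9]{1,3}'
}
```

For example: torsocks curl -sI http://myURL.onion:8080/ | scan_headers

A clean result here does not prove the service is safe. Error pages, directory listings, and page content can leak the same details, and this only looks at headers.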