Last updated: July 4th, 2025
domain-sift is a Perl script that extracts unique domains from at
least one provided file and prints them to standard output in a given
format. If you don’t provide a file, domain-sift reads from standard
input (STDIN) instead.
One use of this utility: extract domains from blocklists that contain known malicious or otherwise undesirable domains, then format them so that a DNS (Domain Name System) resolver can block those domains.
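To make the idea concrete, here is a rough shell approximation of the extract-and-deduplicate step. It is only a sketch: the extract_domains function and its regular expression are assumptions made for illustration, not domain-sift's actual matching logic, which lives in Domain::Sift::Match and is likely stricter. It also relies on grep -o, which GNU and BSD grep both provide but POSIX does not require.

```shell
#!/bin/sh
# Hypothetical helper, not part of domain-sift: print every string on
# standard input that looks like a domain name, once each.
# Input is lowercased first so Example.COM and example.com count as one.
extract_domains() {
	tr '[:upper:]' '[:lower:]' |
		grep -Eo '([a-z0-9]([a-z0-9-]*[a-z0-9])?\.)+[a-z]{2,}' |
		sort -u
}

# A hosts-style blocklist with a comment and a duplicate entry.
printf '%s\n' \
	'# sample blocklist' \
	'0.0.0.0 ads.example.com' \
	'0.0.0.0 tracker.example.net' \
	'0.0.0.0 ADS.example.com' |
	extract_domains
```

With the sample input, this prints ads.example.com and tracker.example.net, one per line. The comment line and the 0.0.0.0 addresses are skipped because the pattern requires the last label to be alphabetic.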
|-- Changes
|-- LICENSE
|-- MANIFEST
|-- Makefile.PL
|-- README.md
|-- bin
|   `-- domain-sift
|-- lib
|   `-- Domain
|       |-- Sift
|       |   |-- Manipulate.pm
|       |   `-- Match.pm
|       `-- Sift.pm
`-- t
    |-- 00-load.t
    |-- Domain-Sift-Manipulate.t
    |-- Domain-Sift-Match.t
    |-- manifest.t
    |-- pod-coverage.t
    `-- pod.t
To install domain-sift, download the most recent release and run the
following commands inside the source directory. Note that domain-sift
requires Perl 5.36 or later, since subroutine signatures are no longer
experimental in that release.
$ perl Makefile.PL
$ make
$ make test
# make install
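If you are unsure whether your Perl meets the 5.36 requirement, a quick check along these lines may help before you build. This is a sketch only: the perl_is_new_enough function name is hypothetical, and it tests whichever perl comes first in your PATH.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the perl in PATH is 5.36 or later.
perl_is_new_enough() {
	# "use v5.36" makes perl exit with an error on older interpreters.
	if perl -e 'use v5.36' 2>/dev/null; then
		return 0
	else
		return 1
	fi
}

if perl_is_new_enough; then
	echo "perl is 5.36 or later"
else
	echo "perl is missing or older than 5.36"
fi
```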
After installation, you can read the documentation with perldoc. man
often works as well.
$ perldoc Domain::Sift
$ perldoc Domain::Sift::Match
$ perldoc Domain::Sift::Manipulate
$ perldoc domain-sift
Here’s how to use domain-sift with unwind(8) on OpenBSD.
Extract domains from your blocklist source:
$ domain-sift /path/to/blocklist_source > blocklist
Move the blocklist to /etc/blocklist:
# mv blocklist /etc/blocklist
Edit unwind.conf to include your new blocklist:
block list "/etc/blocklist"
Restart unwind:
# rcctl restart unwind
Here’s how to use domain-sift with unbound(8) on OpenBSD.
Extract domains from your blocklist source:
$ domain-sift -f unbound /path/to/blocklist_source > blocklist
Move the blocklist to /var/unbound/etc:
# mv blocklist /var/unbound/etc/blocklist
Edit unbound.conf to include your new blocklist:
include: "/var/unbound/etc/blocklist"
Restart Unbound:
# rcctl restart unbound
domain-sift also supports the Response Policy Zone (RPZ) format, which
is defined in an Internet Draft.
With RPZ, you can create DNS blocking policies in a standardized way.
You can even block wildcarded domains (*.example.com also blocks
subdomain.example.com, subdomain.subdomain.example.com, and so on).
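As a rough illustration of what RPZ data looks like, the sketch below turns a plain list of domains into RPZ records. It assumes the common RPZ convention in which a CNAME pointing at the root (.) means "answer NXDOMAIN", and it emits both the bare name and a wildcard for each domain. The to_rpz function is hypothetical, and the exact records that domain-sift -f rpz prints may differ.

```shell
#!/bin/sh
# Hypothetical helper: read one domain per line and print two RPZ records
# for each, one for the name itself and one wildcard for its subdomains.
# Blank lines are skipped (NF is 0 for them).
to_rpz() {
	awk 'NF { printf "%s CNAME .\n*.%s CNAME .\n", $1, $1 }'
}

printf '%s\n' example.com example.net | to_rpz
```

For the two sample domains, this prints four records: example.com CNAME ., *.example.com CNAME ., and the same pair for example.net. The owner names have no trailing dot, so they are relative to whatever $ORIGIN the zone file sets.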
Here’s how to use domain-sift with Unbound and RPZ on OpenBSD.
Extract domains from your blocklist source:
$ domain-sift -f rpz /path/to/blocklist_source > blocklist
Edit unbound.conf:
rpz:
    name: rpz.home.arpa
    zonefile: /var/unbound/etc/rpz-block.zone
    #rpz-log: yes
    rpz-signal-nxdomain-ra: yes
NOTE: rpz.home.arpa serves as an example. The name entry may be
different in your case. In a local area network (LAN) where Unbound
runs on the gateway/router, make sure that a local-data entry exists
somewhere so that the name you chose resolves. Something like this
should work:
local-data: "rpz.home.arpa. IN A x.x.x.x"
You’ll need to replace x.x.x.x with the machine’s actual IP address.
Create /var/unbound/etc/rpz-block.zone:
$ORIGIN rpz.home.arpa.
$INCLUDE /var/unbound/etc/blocklist
Move blocklist to the correct location:
# mv /path/to/blocklist /var/unbound/etc/blocklist
Restart Unbound:
# rcctl restart unbound
domain-sift only deals with extracting domains from text files and
formatting them. It doesn’t fetch blocklists or provide them.
The design explicitly includes this limitation for a few reasons:
It follows the Unix philosophy: do one thing well; read from a file or STDIN; print to STDOUT.
It lets domain-sift use minimal pledge(2) promises through OpenBSD::Pledge(3p).
The simple design makes it much more flexible and portable.
Here’s roughly what I use to fetch blocklists:
$ grep -Ev '^#' blocklist_urls | xargs -- ftp -o - | domain-sift > blocklist
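The pipeline above expects blocklist_urls to hold one URL per line, with lines that start with # treated as comments. The sketch below runs just that filtering step on a made-up file, so it works without network access. The URLs are placeholders under the reserved .example domain, and the list_urls function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the URLs from a list file, skipping comment
# lines. This is the same grep filter used in the pipeline above.
list_urls() {
	grep -Ev '^#' "$1"
}

# Build a throwaway URL list. A leading # disables an entry.
urls=$(mktemp)
cat > "$urls" <<'EOF'
# Placeholder URLs for illustration only
https://blocklists.example/ads.txt
#https://blocklists.example/disabled.txt
https://blocklists.example/malware.txt
EOF

list_urls "$urls"
rm -f "$urls"
```

Only the ads.txt and malware.txt URLs are printed. Commenting out a line is therefore a quick way to disable a source temporarily without deleting it from the list.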
You can find blocklist sources in many places, such as firebog.net.
If you’ve pulled in a lot of domains, Unbound may fail to start on OpenBSD because it doesn’t have enough time to process them all. You can fix this by increasing Unbound’s timeout value.
$ rcctl get unbound timeout
30
# rcctl set unbound timeout 120
$ rcctl get unbound timeout
120
This software is Copyright © 2023 by Ashlen.
This software uses the ISC License (Internet Systems Consortium
License). For more details, see the LICENSE file in the project root.