
Protect E-mail address from scraping on a static site generated by Gatsby

I have a static website built with Gatsby. There is an E-mail address on the website that I want to protect from harvester bots.

My first approach was to send the E-mail address to the client side using GraphQL. The data is base64-encoded, and I decode it on the client side in the React component that displays the E-mail address. But when I build the Gatsby site for production and look at the served index.html, I can see the already decoded E-mail address in the HTML. In production there seems to be no XHR request at all, so all GraphQL queries were evaluated during server-side rendering.
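For reference, the encoding step could look like the sketch below. It assumes the value is encoded once in Node.js (for example when gatsby-config.js is evaluated) and stored in siteMetadata.email; `encodeEmail` and the address `user@example.com` are hypothetical placeholders, not taken from the actual site.

```javascript
// Hypothetical build-time helper; the address below is a placeholder.
function encodeEmail(address) {
  // Buffer is a Node.js global, so this runs at build time, not in the browser.
  return Buffer.from(address, "utf8").toString("base64");
}

// Assumption: the encoded string is what ends up in siteMetadata.email.
const siteMetadata = { email: encodeEmail("user@example.com") };
console.log(siteMetadata.email);
```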

So for the second approach, I tried decoding the E-mail address when the React component mounts. This way the server-side rendered HTML page does not contain the E-mail address, but it is still displayed once the page has loaded.

The relevant parts of the code look like this:

import React, { useState, useEffect } from "react"
import { useStaticQuery, graphql } from "gatsby"

const Contacts = () => {
    const { site } = useStaticQuery(
        graphql`
          query {
            site {
                siteMetadata {
                    email
                }
            }
          }
        `
    )
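    // Hand-written base64 decoder: maps each character to its 6-bit value
    // and emits one byte whenever at least 8 bits have been collected.
    function decode(s) {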
        var e = {}, i, b = 0, c, x, l = 0, a, r = '', w = String.fromCharCode, L = s.length;
        var A = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
        for (i = 0; i < 64; i++) { e[A.charAt(i)] = i; }
        for (x = 0; x < L; x++) {
            c = e[s.charAt(x)]; b = (b << 6) + c; l += 6;
            while (l >= 8) { ((a = (b >>> (l -= 8)) & 0xff) || (x < (L - 2))) && (r += w(a)); }
        }
        return r;
    };

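    const [email, setEmail] = useState("");
    // Effects do not run during server-side rendering, so the decoded
    // address is only set after the component mounts in the browser.
    useEffect(() => {
        setEmail(decode(site.siteMetadata.email));
    }, [site.siteMetadata.email]);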

    return (
        //...
        <span className="service-text">{email}</span>
        //...
    )
}


export default Contacts

Does this approach make any difference? That is, can I protect the E-mail address from bots this way? At least the requested HTML page no longer contains the E-mail address hard-coded.

If you would like to take a look at the page in the developer tools of a browser, it can be found here: https://www.barbaraapartmanheviz.hu/en/

Milan Tenk asked Nov 30 '25 08:11



1 Answer

That should work. useEffect is not executed on the server side, so the email won't be decoded before it's sent to the client.

It seems a bit needlessly complicated, though. I'd say just put {typeof window !== 'undefined' && decode(site.siteMetadata.email)} in your JSX.
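As a rough sketch, the same guard can be wrapped in a small helper. This version assumes the built-in `atob` for base64 decoding instead of the hand-written `decode` from the question; `decodeInBrowserOnly` and its `isBrowser` parameter are hypothetical names introduced here only so the behaviour can be tried outside a browser.

```javascript
// Hypothetical helper; `isBrowser` defaults to the same check as the inline guard.
function decodeInBrowserOnly(encoded, isBrowser = typeof window !== "undefined") {
  // During the Gatsby build there is no `window`, so nothing is decoded
  // and the plain address never reaches the generated HTML.
  if (!isBrowser) return "";
  // atob is the built-in base64 decoder in browsers (and in Node.js 16+).
  return atob(encoded);
}

const sampleEncoded = btoa("user@example.com"); // placeholder address
console.log(decodeInBrowserOnly(sampleEncoded, false)); // build: empty string
console.log(decodeInBrowserOnly(sampleEncoded, true)); // browser: decoded address
```

In the component this would be used as {decodeInBrowserOnly(site.siteMetadata.email)}. Because the server and client output then differ, React may report a hydration mismatch; the useEffect version from the question avoids that.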

Of course, there is no such thing as 100% protection. It's quite possible Google will index this email address, since they do execute JavaScript during indexing. I strongly suspect most scrapers do not, but there might be some that do.
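To illustrate the difference, here is a small sketch of what a scraper might see. The regular expression is only an assumption about how a naive harvester works, and both HTML strings are made-up stand-ins for the served page and for the DOM after JavaScript has run.

```javascript
// Simple pattern of the kind a naive harvester might use (an assumption).
const EMAIL_PATTERN = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;

function findEmails(html) {
  // String.prototype.match returns null when nothing matches.
  return html.match(EMAIL_PATTERN) || [];
}

const encodedAddress = btoa("user@example.com"); // placeholder address

// What a scraper without JavaScript gets: an empty span plus the encoded value.
const servedHtml = `<span class="service-text"></span><script>{"email":"${encodedAddress}"}</script>`;
// What a scraper that executes JavaScript gets after the component has mounted.
const renderedHtml = `<span class="service-text">user@example.com</span>`;

console.log(findEmails(servedHtml)); // no matches
console.log(findEmails(renderedHtml)); // one match
```

So the approach only helps against bots that read the raw HTML; anything that renders the page sees the address like a normal visitor.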

ehrencrona answered Dec 01 '25 20:12
