
Postgres errors on ARM-based M1 Mac w/ Big Sur

Ever since I got a new ARM-based M1 MacBook Pro, I've been experiencing severe and consistent PostgreSQL issues (PostgreSQL 13.1). Whether I use a Rails server or Foreman, I receive errors in both my browser and terminal such as PG::InternalError: ERROR: could not read block 15 in file "base/147456/148555": Bad address, PG::Error (invalid encoding name: unicode), or Error during failsafe response: PG::UnableToSend: no connection to the server. The strange thing is that I can often refresh the browser repeatedly to get things to work (until they inevitably don't again).

I'm aware of all the configuration challenges related to ARM-based M1 Macs, which is why I've uninstalled and reinstalled everything from Homebrew to Postgres multiple times and in numerous ways (with Rosetta, without Rosetta, using arch -x86_64 brew commands, using Postgres.app instead of the Homebrew install). I've encountered a couple of other people on random message boards who are experiencing the same issue (also on new Macs) and not having any luck, which is why I'm reluctant to believe it's a drive-corruption issue. (I've also run Disk Utility's First Aid check multiple times; it says everything is healthy, but I have no idea how reliable that is.)

I'm using thoughtbot's parity to sync my dev-environment database with what's currently in production. When I run development restore production, I get hundreds of lines in my terminal that look like the output below (this is immediately after the download completes but before it goes on to create defaults, process data, handle sequence sets, etc.). I believe this is at the root of the issue, but I'm not sure what the solution would be:

pg_restore: dropping TABLE [table name1]
pg_restore: from TOC entry 442; 1259 15829269 TABLE [table name1] u1oi0d2o8cha8f
pg_restore: error: could not execute query: ERROR:  table "[table name1]" does not exist
Command was: DROP TABLE "public"."[table name1]";
pg_restore: dropping TABLE [table name2]
pg_restore: from TOC entry 277; 1259 16955 TABLE [table name2] u1oi0d2o8cha8f
pg_restore: error: could not execute query: ERROR:  table "[table name2]" does not exist
Command was: DROP TABLE "public"."[table name2]";
pg_restore: dropping TABLE [table name3]
pg_restore: from TOC entry 463; 1259 15830702 TABLE [table name3] u1oi0d2o8cha8f
pg_restore: error: could not execute query: ERROR:  table "[table name3]" does not exist
Command was: DROP TABLE "public"."[table name3]";
pg_restore: dropping TABLE [table name4]
pg_restore: from TOC entry 445; 1259 15830421 TABLE [table name4] u1oi0d2o8cha8f
pg_restore: error: could not execute query: ERROR:  table "[table name4]" does not exist
Command was: DROP TABLE "public"."[table name4]";
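(For reference, the restore step boils down to a pg_restore call against the downloaded dump; the sketch below is an approximation with placeholder names, not parity's exact invocation. With --clean, pg_restore issues a DROP for every object before recreating it, which is why an empty local database produces the "does not exist" errors shown above; they're noisy but harmless.)

# an approximation with placeholder names; parity's exact flags may differ
pg_restore --verbose --clean --no-acl --no-owner \
  -d my_app_development latest.dump
# --clean drops each object before recreating it, so the DROP fails harmlessly
# when the table doesn't exist yet; adding --if-exists suppresses those errors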

Has anyone else experienced this? Any solution ideas would be much appreciated. Thanks!

EDIT: I was able to reproduce the same issue on an older MacBook Pro (also running Big Sur), so it seems unrelated to M1 but potentially related to Big Sur.

asked Jan 12 '21 by Carl


People also ask

Does M1 Mac use ARM64?

The M1's CPU uses the 64-bit AArch64 (also called ARM64) instruction set of the ARM architecture. Likewise, the integrated GPU shouldn't be surprising; Intel and AMD chips have included integrated graphics for years. The GPU in the Apple M1 has eight cores and takes up only a bit more space on the chip than the eight CPU cores.

Is M1 Mac ARM or x86?

Unlike Intel chips built on the x86 architecture, the Apple Silicon M1 uses an Arm-based architecture much like the A-series chips that Apple has been designing for iPhones and iPads for years now.

What port is Postgres running on Mac?

Postgres listens on port 5432 by default (this can be changed in postgresql.conf).
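A quick way to confirm the port a local server is actually using (a sketch; adjust host, port, and database if you've changed the defaults):

pg_isready -h localhost -p 5432                        # is the server accepting connections here?
psql -h localhost -p 5432 -d postgres -c "SHOW port;"  # ask the connected server for its port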

Can you run x86 Ubuntu on M1 Mac?

Rosetta 2 is a translation layer that lets x86 macOS applications run on Apple Silicon, but it does not extend to virtualization, so the M1 is not well suited to running x86 virtual machines such as an x86 build of Ubuntu.


2 Answers

UPDATE #2:

The WAL buffer adjustments described below extended the time between errors but didn't eliminate them completely. I ended up installing a fresh, native Apple Silicon build of Postgres via Homebrew, then doing a pg_dump of my existing database (the one experiencing the errors) and restoring it into the new installation/cluster.

Here's the interesting bit: pg_restore failed to restore one of the indexes in the database and noted it during the restore process (which otherwise completed). My hunch is that corruption or some other issue with this index was causing the Bad Address errors. As such, my final suggestion on this issue is to perform a pg_dump and then restore it with pg_restore: pg_restore flagged the problem where pg_dump didn't, writing a clean DB without the faulty index.
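In case it's useful, the dump-and-restore looked roughly like this (a sketch; the database names and dump filename are placeholders, and your cluster layout may differ):

pg_dump -Fc -d my_app_development -f my_app_development.dump   # custom-format dump of the affected DB
createdb -T template0 my_app_development_fresh                  # empty database in the new cluster
pg_restore --verbose --no-owner -d my_app_development_fresh my_app_development.dump
# pg_restore reports any object it couldn't recreate (here, the faulty index)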

UPDATE:

Continued to experience this issue after attempting several workarounds, including a full pg_dump and restore of the affected database. And while some of the fixes seem to extend the time between occurrences (particularly increasing shared buffer memory), none have proven a permanent fix.

That said, some more digging on postgres mailing lists revealed that this "Bad Address" error can occur in conjunction with WAL (write-ahead-log) issues. As such, I've now set the following in my postgresql.conf file, significantly increasing the WAL buffer size:

wal_buffers = 4MB

and have not experienced the issue since (knock on wood, again).

It makes sense that this would have some effect, since wal_buffers defaults to a size proportional to shared_buffers (and, as mentioned above, increasing the shared buffer size provided temporary relief). Anyway, it's something else to try until we get a definitive word on what's causing this bug.
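If you want to try the same change, here's one way to apply it (a sketch; the Homebrew service name is an assumption, and wal_buffers is only read at server start):

psql -d postgres -c "ALTER SYSTEM SET wal_buffers = '4MB';"   # or edit postgresql.conf by hand
brew services restart postgresql@13                            # assumes a Homebrew install; a restart is required
psql -d postgres -c "SHOW wal_buffers;"                        # confirm the new value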


Was having this exact issue sporadically on an M1 MacBook Air: ERROR: could not read block and Bad Address in various permutations.

I read in a Postgres forum that this issue can occur in virtual-machine setups, so I assume it's somehow caused by Rosetta. Even if you're using the Universal version of Postgres, you're likely still running an x86 binary for some adjunct process (e.g. Python in my case).
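(If you want to see which binaries are actually running as x86 under Rosetta, something like this works; the paths depend on your install:)

file "$(which postgres)" "$(which psql)"   # prints arm64 and/or x86_64 for each binary
ps aux | grep '[p]ostgres'                 # locate the running server processes
# Activity Monitor's "Kind" column likewise shows Apple vs Intel per process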

Regardless, here's what has solved the issue (so far): reindexing the database

Note: you need to reindex from the command line, not using SQL commands. When I attempted to reindex using SQL, I encountered the same Bad Address error over and over, and the reindexing never completed.

When I reindexed using the command line, the process finished, and the Bad Address error has not recurred (knock on wood).

For me, it was just:

reindexdb name_of_database

It took 20-30 minutes for a 12 GB DB. Not only am I not getting these errors anymore, but the database seems snappier to boot. I only hope the issue doesn't return with repeated reads/writes/index creation under Rosetta. I'm not sure why this works... maybe indexes created on M1 Macs are prone to corruption, or maybe they become corrupt as they're written or read through Rosetta?
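If you'd like to check whether an index is actually corrupt before (or after) reindexing, the contrib amcheck extension can verify B-tree indexes. A sketch, reusing name_of_database as the placeholder:

psql -d name_of_database -c "CREATE EXTENSION IF NOT EXISTS amcheck;"
psql -d name_of_database -c "
  SELECT c.relname AS index_name, bt_index_check(c.oid)
  FROM pg_class c
  JOIN pg_index i ON i.indexrelid = c.oid
  JOIN pg_am a ON a.oid = c.relam
  WHERE a.amname = 'btree'
    AND c.relnamespace = 'public'::regnamespace;"
# a corrupt index raises an error naming the affected relation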

answered Oct 16 '22 by Ben Wilson


Is it possible that something in the Big Sur Beta 11.3 fixed this issue?

I've been having the same issues as OP since installing PostgreSQL 13 using MacPorts on my Mac mini M1 (now on PostgreSQL 13.2).

I would see could not read block errors:

  1. Occasionally when running ad hoc queries
  2. Always when compiling a book in R Markdown that makes several queries
  3. Always when running VACUUM FULL on my main database (there's about 620 GB in the instance on this machine and the error would be thrown very quickly relative to how long a VACUUM FULL would take).

(My "fix" so far has been to point my Mac to the Ubuntu server I have running in the corner of my office, so no real problem for me.)

But I've managed to do 2 and 3 without the error since upgrading to Big Sur Beta 11.3 today (both failed immediately prior to upgrading). Is it possible that something in the OS fixed this issue?

answered Oct 16 '22 by Ian Gow