invalid reference to FROM-clause entry for table "d"

Question

As part of the k-means algorithm, I am trying to update the cluster each item belongs to as in the below query. Problem is, I cannot seem to be able to reference table d inside the nested query.

UPDATE algorithms.km_crimes d SET cluster_id = c.id 
FROM (SELECT id FROM algorithms.km_cluster_centres c 
ORDER BY |/ (POW(d.latitude-c.latitude,2)+POW(d.longitude-c.longitude,2))      
ASC LIMIT 1) AS c
WHERE d.cluster_id IS DISTINCT FROM c.id;

Can anyone suggest how to restructure the query? I've tried too many modifications to count

Nick Barnes · Accepted Answer

Based on the MySQL example which you are converting, you don't need to change this first query at all.

The algorithm doesn't care how many times a cluster_id is reassigned in each iteration; it just needs to stop when none of the cluster centers are moved. And luckily, the second query is much easier to fix.

This appears to work:

CREATE TABLE km_data (id serial, cluster_id int, lat double precision, lng double precision);
CREATE TABLE km_clusters (id serial, lat double precision, lng double precision);

CREATE OR REPLACE FUNCTION kmeans(k int) RETURNS VOID LANGUAGE plpgsql AS $$
BEGIN
  TRUNCATE km_clusters;

  INSERT INTO km_clusters (lat, lng)
  SELECT lat, lng FROM km_data
  ORDER BY random() LIMIT k;

  LOOP
    UPDATE km_data d SET cluster_id = (
      SELECT id FROM km_clusters c 
      ORDER BY |/(POW(d.lat-c.lat,2)+POW(d.lng-c.lng,2)) LIMIT 1
    );

    UPDATE km_clusters c
    SET lat=d.lat, lng=d.lng
    FROM (
      SELECT
        cluster_id, 
        AVG(lat) AS lat,
        AVG(lng) AS lng
      FROM km_data
      GROUP BY cluster_id
    ) d 
    WHERE
      c.id=d.cluster_id AND
      ABS(c.lat-d.lat) < 0.001 AND
      ABS(c.lng-d.lng) < 0.001;

    EXIT WHEN NOT FOUND;
  END LOOP;
END $$;

You can tune the numbers in the final WHERE clause if you want more precision, though this looks like a pretty imprecise algorithm to begin with.

invalid reference to FROM-clause entry for table "d"

Tags:

postgresql

k-means

Ponsietta

1 Answers

Nick Barnes

Recent Activity

Donate For Us

invalid reference to FROM-clause entry for table "d"

Tags:

postgresql

k-means

Ponsietta

1 Answers

Nick Barnes

Related questions

Recent Activity

Donate For Us