How can I extract the column names (or the row and column indices) of duplicated elements in the following data frame?
            V1         V2           V3           V4
PC1  0.5863431  0.5863431 3.952237e-01 3.952237e-01
PC2 -0.3952237 -0.3952237 5.863431e-01 5.863431e-01
PC3 -0.7071068  0.7071068 1.665335e-16 3.885781e-16
For example, 0.5863431 is equal to 0.5863431, so "V1" and "V2" are the column names.
For that data frame I want to get:
[1] "V1" "V2" "V3" "V4"
As you can see, this is the result you get from looking at the first row alone.
Second example:
            V1         V2          V3         V4
PC1 -0.5987139 -0.5987139 -0.03790446  0.5307039
PC2 -0.0189601 -0.0189601 -0.99315168 -0.1137136
PC3  0.3986891  0.3523926 -0.11045319  0.8394442
Result:
[1] "V1" "V2"
There may be a better way, but here's my take on it.
## coerce to matrix (if not already)
m <- as.matrix(df)
## find duplicates across both margins
d <- duplicated(m, MARGIN = 0) | duplicated(m, MARGIN = 0, fromLast = TRUE)
## grab the unique col names
colnames(m)[unique(col(d)[d])]
Examples: On your first data frame -
df1 <- read.table(text = "V1 V2 V3 V4
PC1 0.5863431 0.5863431 3.952237e-01 3.952237e-01
PC2 -0.3952237 -0.3952237 5.863431e-01 5.863431e-01
PC3 -0.7071068 0.7071068 1.665335e-16 3.885781e-16", header = TRUE)
m1 <- as.matrix(df1)
d1 <- duplicated(m1, MARGIN = 0) | duplicated(m1, MARGIN = 0, fromLast = TRUE)
colnames(m1)[unique(col(d1)[d1])]
# [1] "V1" "V2" "V3" "V4"
And on the second -
df2 <- read.table(text = "V1 V2 V3 V4
PC1 -0.5987139 -0.5987139 -0.03790446 0.5307039
PC2 -0.0189601 -0.0189601 -0.99315168 -0.1137136
PC3 0.3986891 0.3523926 -0.11045319 0.8394442", header = TRUE)
m2 <- as.matrix(df2)
d2 <- duplicated(m2, MARGIN = 0) | duplicated(m2, MARGIN = 0, fromLast = TRUE)
colnames(m2)[unique(col(d2)[d2])]
# [1] "V1" "V2"
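If you also want the row and column index of each duplicated value, as the question mentions, something along these lines might work. It is only a sketch built on the same logical matrix as above, using `which(..., arr.ind = TRUE)`; the name `dup_cells` is just an illustrative choice.

```r
# Rebuild the second example from the question
df2 <- read.table(text = "V1 V2 V3 V4
PC1 -0.5987139 -0.5987139 -0.03790446 0.5307039
PC2 -0.0189601 -0.0189601 -0.99315168 -0.1137136
PC3 0.3986891 0.3523926 -0.11045319 0.8394442", header = TRUE)
m2 <- as.matrix(df2)

# TRUE wherever a value occurs more than once anywhere in the matrix
d2 <- duplicated(m2, MARGIN = 0) | duplicated(m2, MARGIN = 0, fromLast = TRUE)

# One row per duplicated cell, holding its row and column position
idx <- which(d2, arr.ind = TRUE)

# Attach the dimension names and the value itself
# (dup_cells is a hypothetical name, not part of the original answer)
dup_cells <- data.frame(
  row   = rownames(m2)[idx[, "row"]],
  col   = colnames(m2)[idx[, "col"]],
  value = m2[idx],
  stringsAsFactors = FALSE
)
dup_cells
```

Here `idx` holds the numeric positions, and `dup_cells` lists PC1 and PC2 under both V1 and V2, which matches the "V1" "V2" result above.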
Side note: Since your data contains only numeric values, I would recommend starting with a matrix instead of a data frame.
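One caveat with the approach above: `duplicated()` compares numeric values exactly, while R prints only 7 significant digits by default. If your real values come out of a computation, two numbers that print the same may still differ slightly and would not be flagged. A possible workaround, assuming agreement to a fixed number of significant digits is what you mean by "duplicate", is to apply `signif()` first. The helper name `dup_cols` and the toy matrix below are made up for illustration.

```r
# Hypothetical helper: same logic as above, with optional rounding first
dup_cols <- function(m, digits = NULL) {
  m <- as.matrix(m)
  # Round to a fixed number of significant digits if requested
  if (!is.null(digits)) m <- signif(m, digits)
  d <- duplicated(m, MARGIN = 0) | duplicated(m, MARGIN = 0, fromLast = TRUE)
  colnames(m)[unique(col(d)[d])]
}

# Toy matrix: the two PC1 values print identically but are not exactly equal
m <- matrix(c(0.5863431, 0.58634310000001,
              0.25,      0.75),
            nrow = 2, byrow = TRUE,
            dimnames = list(c("PC1", "PC2"), c("V1", "V2")))

dup_cols(m)              # exact comparison finds nothing
# character(0)
dup_cols(m, digits = 7)  # comparison after rounding to 7 significant digits
# [1] "V1" "V2"
```

Whether 7 digits is the right tolerance depends on your data, so treat the `digits` value as something to tune.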