I'm given a .csv data set and I want to write a function that processes it with a for loop. The data set has 5 columns, each either a factor or numeric. If a column is a factor, nothing should be done other than printing the column's name and its class. If a column is numeric, print the name and class as well as the results of two functions (that I had already created beforehand).
I'm a bit lost on how to go about organizing the syntax for the function/for loop.
p.data <- read.csv("file.csv")
function(
x){
for.loop.variable <- for(index in data.csv){
if (class(x)) == "factor"
{cat("Name of Column is:", names(x*), "\n",
"Class of Column is :" , (class(x)))
} else {
cat("Name of Column is:", names(x*), "\n",
"Class of Column is :" , (class(x)),"\n",
"Function 1 is :", function.1(x), "\n",
"Function 2 is :", function.2(x), "\n")
}
}
}
return (for.loop.variable)
I think that's the right set up but I have 3 questions that I can't seem to figure out:
1- How does the for loop iteration come into play? I hadn't referenced it in the conditionals at all and I'm not sure how to go about doing so?
2*- How do I go about calling/printing the column name? I don't think it's names(x) but I'm not too sure what it would be other than that.
3- Is the return () correct? Should it return the entire for loop (once I figure out how to tie it into the actual problem) from the variable?
Please let me know where to fix errors so I can properly learn this, please.
To run an if-then statement in R, we use the if () {} construct. It has two main elements: a logical test in the parentheses and conditional code in curly braces. The code in the curly braces is conditional because it is only evaluated if the logical test in the parentheses is TRUE.
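As a minimal sketch of that structure (the variable `x` and its value are made up for illustration), the code in the braces runs only when the test evaluates to TRUE:

```r
# Hypothetical value, chosen only for illustration
x <- 5

# The test in parentheses must evaluate to a single TRUE or FALSE
if (is.numeric(x)) {
  # Runs only because is.numeric(x) is TRUE
  message_text <- "x is numeric"
} else {
  # Runs only when the test is FALSE
  message_text <- "x is not numeric"
}

print(message_text)
```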
Conditional statements with the proper comparison and boolean operators allow the creation of alternate execution paths in the code. Loops allow repeated execution of the same set of statements on every object in a sequence. An index-based for loop is well suited to changing items within a list, because the index lets you assign back to the position you are currently visiting.
Explanation: A conditional expression can be used in place of a full if-else statement. If the condition is true, it evaluates expression 1; otherwise it evaluates expression 2. A loop, on the other hand, repeats the same instructions until its stopping condition is met.
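A small sketch of both ideas in R, assuming a made-up vector `values`: an inline `if ... else` picks one of two expressions, `ifelse()` makes the same choice element by element, and an index-based `for` loop repeats the same statements while changing items in place.

```r
# Hypothetical input vector, for illustration only
values <- c(3, -1, 4)

# Inline if ... else: evaluates to one of two expressions
first_label <- if (values[1] > 0) "positive" else "not positive"

# ifelse() applies the same choice to every element at once
all_labels <- ifelse(values > 0, "positive", "not positive")

# Index-based for loop: the index i lets us assign back into the vector
clipped <- values
for (i in seq_along(clipped)) {
  if (clipped[i] < 0) {
    clipped[i] <- 0  # replace negative entries with zero
  }
}
```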
I assume this question is for educational purposes and therefore present a simple for loop:
class_print <- function(df){
  for(i in seq_along(df)){
    # the column name lives on the data frame, so look it up by position
    col_name <- names(df)[i]
    if(is.factor(df[[i]])){
      print(paste0("Name of column is ", col_name, ", class is factor"))
    }
    else{
      print(paste0("Name of column is ", col_name, ", class is ", class(df[[i]])))
    }
  }
}
Testing:
class_print(iris)
[1] "Name of column is Sepal.Length, class is numeric"
[1] "Name of column is Sepal.Width, class is numeric"
[1] "Name of column is Petal.Length, class is numeric"
[1] "Name of column is Petal.Width, class is numeric"
[1] "Name of column is Species, class is factor"
You were missing some parentheses, and a few look like they got moved to the wrong spot. I took your code and cleaned it up to help you compare.
Since cat in R prints straight to the console, you don't necessarily need return. A for loop itself evaluates to NULL, so assigning it to a variable and returning that variable gives you nothing useful. When I tested it, return made it so only the last iteration of the loop printed.
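# In R >= 4.0, text columns are read as character by default;
# pass stringsAsFactors = TRUE if you need them as factors.
p.data <- read.csv("file.csv")
for(i in seq_along(p.data)){
  x <- p.data[[i]] # pull the individual column for this current iteration
  col.name <- names(p.data)[i] # the name lives on the data frame, not on the column vector
  if (is.factor(x)){
    cat("Name of Column is:", col.name, "\n",
        "Class of Column is :", class(x), "\n")
  } else {
    cat("Name of Column is:", col.name, "\n",
        "Class of Column is :", class(x), "\n",
        "Function 1 is :", function.1(x), "\n",
        "Function 2 is :", function.2(x), "\n")
  }
}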
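To tie this back to the original request for a function, here is a hedged sketch that wraps the loop and runs on the built-in iris data. The question does not show function.1 and function.2, so mean() and sd() are used purely as stand-ins, and the name describe_columns is hypothetical.

```r
# Stand-ins for the asker's function.1 and function.2, which the question
# does not show; mean() and sd() are assumptions used only for illustration.
function.1 <- function(x) mean(x)
function.2 <- function(x) sd(x)

# describe_columns is a hypothetical name for the wrapper function
describe_columns <- function(df) {
  for (i in seq_along(df)) {
    x <- df[[i]]  # the column for this iteration
    cat("Name of Column is:", names(df)[i], "\n")
    cat("Class of Column is:", class(x), "\n")
    if (is.numeric(x)) {
      # Only numeric columns get the two extra summaries
      cat("Function 1 is:", function.1(x), "\n")
      cat("Function 2 is:", function.2(x), "\n")
    }
  }
  invisible(NULL)  # cat() already printed everything; nothing to return
}

describe_columns(iris)
```

Returning invisible(NULL) makes it explicit that the function is called for its printed output rather than for a value.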
Doing this in a for loop in R is alright. Depending on who you ask, you might be told that for loops in R are very slow. This is only partly true: a loop becomes very slow when it grows a data frame or a vector on each iteration without preallocating the object's memory.
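A rough sketch of the difference, assuming a made-up size `n`: the first loop grows its result on every iteration, while the second allocates the full vector once and fills it by index. Both produce the same values.

```r
n <- 1000  # hypothetical size, for illustration only

# Slow pattern: the vector is copied into a larger one on every iteration
grown <- c()
for (i in seq_len(n)) {
  grown <- c(grown, i^2)
}

# Faster pattern: allocate the full vector once, then fill it by index
preallocated <- numeric(n)
for (i in seq_len(n)) {
  preallocated[i] <- i^2
}

# Check that the two approaches agree
same_result <- identical(grown, preallocated)
```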
Another tool you can use is walk from the purrr package, but since you asked for a for loop, I wrote it as a for loop first. A plain walk only passes the column values, so names(x) would be NULL inside the function; the version below uses its sibling iwalk, which also passes each column's name:
library(purrr)
# iwalk() calls the function with each column and that column's name
iwalk(p.data, function(x, col.name){
  if (is.factor(x)){
    cat("Name of Column is:", col.name, "\n",
        "Class of Column is :", class(x), "\n")
  } else {
    cat("Name of Column is:", col.name, "\n",
        "Class of Column is :", class(x), "\n",
        "Function 1 is :", function.1(x), "\n",
        "Function 2 is :", function.2(x), "\n")
  }
}
)