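This tutorial explains how to write an R program that compares two data frames to find the rows that are in the first data frame but not in the second. The data frames are created with the built-in function data.frame(). A data frame stores a data table as a list of vectors of equal length, one vector per column. The function setdiff() calculates the (nonsymmetric) set difference of two collections: the items of x that do not occur in y. The syntax of this function is,
setdiff(x, y, ...)
where x and y are vectors, data frames, or ps objects containing a sequence of items, and the dots (...) indicate arguments to be passed to or from other methods. This form, with ps (probability space) objects and the dots argument, matches the setdiff() documented in the prob package; base R's setdiff(x, y) takes only x and y and treats a data frame as a list of columns, so on data frames it compares columns rather than rows.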
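To make the "nonsymmetric" part concrete, here is a minimal sketch using base R's setdiff() on two plain vectors. The vectors a and b are made up for illustration and are not part of the original program.

```r
# Illustrative vectors (hypothetical, not from the tutorial's data)
a <- c(1, 2, 3, 4)
b <- c(3, 4, 5)

# Elements of a that do not occur in b
only_in_a <- setdiff(a, b)

# Swapping the arguments gives a different result,
# which is why the difference is called nonsymmetric
only_in_b <- setdiff(b, a)

print(only_in_a)
## [1] 1 2
print(only_in_b)
## [1] 5
```

The argument order matters: setdiff(DF1, DF2) asks what is in DF1 but missing from DF2, not the other way around.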
Below are the steps used in the R program to compare the two data frames. The data frames are passed directly to a built-in function. The variables DF1 and DF2 hold the two data frames, each created by calling data.frame(). Finally, the two data frames are compared by calling setdiff(DF1, DF2).
STEP 1: Assign data frames to the variables DF1 and DF2
STEP 2: Print the original data frames
STEP 3: Compare the two data frames by calling setdiff(DF1, DF2)
STEP 4: Print the final data frame
# First data frame: monthly values per item
DF1 <- data.frame(
  item = c("item1", "item2", "item3"),
  Jan = c(12, 14, 12),
  Feb = c(11, 12, 15),
  Mar = c(12, 14, 15)
)
# Second data frame: same items, different Mar values for item2 and item3
DF2 <- data.frame(
  item = c("item1", "item2", "item3"),
  Jan = c(12, 14, 12),
  Feb = c(11, 12, 15),
  Mar = c(12, 15, 18)
)
print("Original Dataframes:")
## [1] "Original Dataframes:"
print(DF1)
## item Jan Feb Mar
## 1 item1 12 11 12
## 2 item2 14 12 14
## 3 item3 12 15 15
print(DF2)
## item Jan Feb Mar
## 1 item1 12 11 12
## 2 item2 14 12 15
## 3 item3 12 15 18
print("Row(s) in first data frame that are not present in second data frame:")
## [1] "Row(s) in first data frame that are not present in second data frame:"
print(setdiff(DF1,DF2))
## Mar
## 1 12
## 2 14
## 3 15
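The output above contains only the Mar column because base R's setdiff() treats each data frame as a list of columns: item, Jan, and Feb are identical in both, so only Mar is reported. A row-wise comparison can be sketched in base R as follows. The helper row_key() is a hypothetical name introduced here for illustration, and the approach (pasting each row into a single string key) is one possible technique, not the method of the original program.

```r
DF1 <- data.frame(
  item = c("item1", "item2", "item3"),
  Jan = c(12, 14, 12),
  Feb = c(11, 12, 15),
  Mar = c(12, 14, 15)
)
DF2 <- data.frame(
  item = c("item1", "item2", "item3"),
  Jan = c(12, 14, 12),
  Feb = c(11, 12, 15),
  Mar = c(12, 15, 18)
)

# Hypothetical helper: collapse every row into one string so that
# whole rows can be compared with %in%. The separator "\r" is assumed
# not to occur in the data.
row_key <- function(df) {
  do.call(paste, c(unname(as.list(df)), sep = "\r"))
}

# Keep the rows of DF1 whose key does not appear among the keys of DF2
only_in_first <- DF1[!(row_key(DF1) %in% row_key(DF2)), , drop = FALSE]

print(only_in_first)
```

With the data above, this keeps rows 2 and 3 of DF1 (item2 and item3), since their Mar values differ from those in DF2. If the dplyr package is installed and attached, its setdiff() method for data frames also compares rows, so setdiff(DF1, DF2) would then return those same two rows rather than a column.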