Monday, 9 September 2013

R issue: How to introduce tags from dataframe to specific tag documents inside a corpus

Is it possible to add an additional tag to a corpus document that already
has another tag with a specific value?

I have a corpus with 20 docs. Each doc has a character ID set as a tag. I
also have a data frame that covers some of the 20 docs. Besides the ID, the
data frame has a tag column. These tags have to be set on all corresponding
docs inside the corpus. How would you find the right doc in the corpus for
each row of the data frame and set the corresponding tag?
Example:

library("tm")
data(crude) # a corpus with 20 docs
meta(crude, type="local", tag="alphanumID") <- letters[1:20] # adding some additional IDs to the corpus
mydf <- data.frame(cbind(SomeTag=sample.int(1e8, size = 15, replace = FALSE, prob = NULL),
                         alphanumID=paste(letters[1:15], 16:30, sep="")))
# Creating a dataframe with similar IDs
mydf <- mydf[sample(nrow(mydf)),] # permutation of elements (rows)
rownames(mydf) <- 1:15 # overwriting the rownames
mydf
> mydf
    SomeTag alphanumID
1  42381006        e20
2  69493835        l27
3  26159062        d19
4  44461050        n29
5  83040006        c18
6  30317343        g22
7  33704495        a16
8  64416661        k26
9  56907735        m28
10 96524612        f21
11 23311778        h23
12 81621392        o30
13 41966863        b17
14 94132818        j25
15 46272404        i24
## This is the ISSUE
## How to introduce the "SomeTag" values to the docs with the corresponding
## alphanumID inside the corpus?
## Pseudocode for what I am trying to do:
for (i in seq_len(nrow(mydf))) {
  meta(crude[at the position crude$alphanumID == mydf$alphanumID[i]],
       tag="SomeTag", type="local") <- mydf$SomeTag[i]
}
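
The lookup step in the pseudocode above ("at the position ...") can probably be done with base R's match(). The following is only a minimal sketch of that matching step, not a tested tm solution: doc_ids is a hypothetical stand-in for the alphanumID values of the corpus documents, and lookup is a hypothetical small data frame built from three rows of the table above.

```r
# Hypothetical stand-in for the alphanumID tag of each corpus document,
# in corpus order (assumption: the corpus IDs use the same format as mydf).
doc_ids <- c("a16", "b17", "c18", "d19", "e20")

# Hypothetical lookup table: arbitrary row order, covers only some documents.
lookup <- data.frame(
  SomeTag    = c(83040006, 33704495, 42381006),
  alphanumID = c("c18", "a16", "e20"),
  stringsAsFactors = FALSE
)

# match() returns, for each document ID, the row of `lookup` that holds it,
# or NA when the document has no entry in the data frame.
rows <- match(doc_ids, lookup$alphanumID)

# One tag per document, aligned with doc_ids; NA marks untagged documents.
tags <- lookup$SomeTag[rows]

# Positions of the documents that actually receive a tag.
tagged <- which(!is.na(rows))
```

With the positions in tagged known, the write-back would be a loop over those positions, presumably along the lines of meta(crude[[i]], tag="SomeTag") <- tags[i]. That call form is an assumption about the installed tm version and should be checked against its documentation for meta().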
