split join data.table R
Objective

Join DT1 (as i in data.table) to DT2 on the given key column(s), within each group of DT2 specified by the Date column.

I cannot run DT2[DT1, on = 'key'], as that would be incorrect: the key column is repeated across the Date column, but unique within a single date.

Reproducible example with a working solution

DT3 is my expected output. Is there any way to achieve this without the split manoeuvre, which does not feel very data.table-y?
library(data.table)
set.seed(1)
DT1 <- data.table(
  Segment = sample(paste0('S', 1:10), 100, TRUE),
  Activity = sample(paste0('A', 1:5), 100, TRUE),
  Value = runif(100)
)
dates <- seq(as.Date('2018-01-01'), as.Date('2018-11-30'), by = '1 day')
DT2 <- data.table(
  Date = rep(dates, each = 5),
  Segment = sample(paste0('S', 1:10), 3340, TRUE),
  Total = runif(3340, 1, 2)
)
rm(dates)
# To ensure that each Date Segment combination is unique
DT2 <- unique(DT2, by = c('Date', 'Segment'))
# Split DT2 by Date, join DT1 to each piece on Segment, then recombine
iDT2 <- split(DT2, by = 'Date')
iDT2 <- lapply(
  iDT2,
  function(x) {
    x[DT1, on = 'Segment', nomatch = 0]
  }
)
DT3 <- rbindlist(iDT2, use.names = TRUE)
r data.table
asked Nov 12 '18 at 20:49 by Ameya
What about using the merge function with keys 'Date' and 'Segment'? – Heikki, Nov 12 '18 at 20:59
Date does not exist in DT1, hence it can't be used to merge. – Ameya, Nov 12 '18 at 21:03
Sorry, I should have written merge by 'Segment' (only). – Heikki, Nov 12 '18 at 21:40
1 Answer
You can achieve the same result with a cartesian merge:
DT4 <- merge(DT2, DT1, by = 'Segment', allow.cartesian = TRUE)
Here is the proof:
> all(DT3[order(Segment, Date, Total, Activity, Value),
          c('Segment', 'Date', 'Total', 'Activity', 'Value')] ==
      DT4[order(Segment, Date, Total, Activity, Value),
          c('Segment', 'Date', 'Total', 'Activity', 'Value')])
[1] TRUE
answered Nov 12 '18 at 21:39 by Heikki
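For readers who prefer the bracket join syntax used in the question, the same inner join can probably be written without merge. The sketch below is not part of the original answer; it uses small made-up tables (not the question's random data) and compares the split-by-Date approach with a single join on Segment. It assumes that every Date/Segment pair in DT2 is unique, as in the question.

```r
library(data.table)

# Small hypothetical tables mirroring the structure of DT1 and DT2
DT1 <- data.table(
  Segment  = c('S1', 'S1', 'S2', 'S3'),
  Activity = c('A1', 'A2', 'A1', 'A3'),
  Value    = c(0.1, 0.2, 0.3, 0.4)
)
DT2 <- data.table(
  Date    = as.Date(c('2018-01-01', '2018-01-01', '2018-01-02', '2018-01-02')),
  Segment = c('S1', 'S2', 'S1', 'S4'),
  Total   = c(1.1, 1.2, 1.3, 1.4)
)

# Split-by-Date approach from the question
by_date <- rbindlist(
  lapply(split(DT2, by = 'Date'),
         function(x) x[DT1, on = 'Segment', nomatch = 0]),
  use.names = TRUE
)

# Single inner join on Segment; allow.cartesian permits one DT1 row
# to match DT2 rows on several dates
joined <- DT2[DT1, on = 'Segment', nomatch = 0, allow.cartesian = TRUE]

# Row order differs between the two results, so compare them as sets of rows
same_rows <- fsetequal(by_date, joined)
```

Because Segment is unique within each Date, matching on Segment alone already pairs every DT1 row with each dated DT2 row once, which is why no explicit grouping by Date is needed.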
Thanks, allow.cartesian does it. – Ameya, Nov 12 '18 at 22:10
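As background on why allow.cartesian matters here: by default data.table refuses a join whose result would have more than nrow(x) + nrow(i) rows, which is what happens when repeated key values on both sides multiply. A minimal sketch of that guard, using hypothetical tables named left and right that are not from the question:

```r
library(data.table)

# Hypothetical tables in which every row shares the same join value
left  <- data.table(id = rep('a', 3), left_val  = 1:3)
right <- data.table(id = rep('a', 3), right_val = 4:6)

# Without allow.cartesian the 3 x 3 = 9 row result exceeds
# nrow(left) + nrow(right) = 6, so the join stops with an error
blocked <- tryCatch(
  {
    left[right, on = 'id']
    FALSE
  },
  error = function(e) TRUE
)

# Opting in returns all 9 combinations
full <- left[right, on = 'id', allow.cartesian = TRUE]
```

In the question's data each Segment appears many times in both DT1 and DT2, so the join on Segment alone is large enough to trip this check unless allow.cartesian = TRUE is set.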