Hi,
I am getting the following message when I try to fit a simple linear model, y = mx + c. I have 12,000 rows.
*** ERROR 10037 *** PROBLEM EXCEEDS AVAILABLE RESOURCES
Please let me know.
thanks,
Suresh
The error message is telling, isn’t it?
I guess that PHX was not built for such a high number of cases and you are running out of available RAM.
12,000 rows are peanuts for R. Example:
r <- 5e7                       # rows
c <- 2                         # columns
x <- rnorm(r, mean = 1)        # generate random
y <- x + rnorm(r, mean = 0.1)  # data (slope 1, intercept 0.1)
data <- matrix(c(x, y), nrow = r, ncol = c,
               dimnames = list(NULL, c("x", "y")))
# fit a linear model with intercept
mod <- lm(y ~ x, data = as.data.frame(data))
print(mod)
Takes a couple of seconds on my machine (16 GB RAM) to generate a data set with 50 million (!) rows of random data and fit a linear model.
Call:
lm(formula = y ~ x, data = as.data.frame(data))

Coefficients:
(Intercept)            x
     0.0977      0.99999
Even in basic R sooner or later you will run out of RAM (the function lm() does not swap to disk)… If you have really large data sets, consider package biglm (https://cran.r-project.org/package=biglm).
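To illustrate why a linear fit does not need all rows in RAM at once, here is a minimal sketch in base R that accumulates X'X and X'y chunk by chunk and solves the normal equations at the end. The chunk count, chunk size and seed are hypothetical, and this is not how biglm works internally (that package has its own updating algorithm); it only shows the general idea.

```r
# Sketch: fit y = c + m*x in chunks by accumulating X'X and X'y.
set.seed(1)                  # hypothetical seed, for reproducibility only
n_chunks   <- 10             # hypothetical chunking:
chunk_size <- 1200           # 10 x 1200 = 12,000 rows in total
xtx <- matrix(0, 2, 2)       # running sum of X'X
xty <- matrix(0, 2, 1)       # running sum of X'y
all_x <- numeric(0)          # kept only to compare with lm() below;
all_y <- numeric(0)          # a real out-of-memory fit would not store these
for (i in seq_len(n_chunks)) {
  x <- rnorm(chunk_size, mean = 1)
  y <- x + rnorm(chunk_size, mean = 0.1)   # slope 1, intercept 0.1
  X <- cbind(1, x)                         # design matrix with intercept column
  xtx <- xtx + crossprod(X)                # add this chunk's X'X
  xty <- xty + crossprod(X, y)             # add this chunk's X'y
  all_x <- c(all_x, x)
  all_y <- c(all_y, y)
}
beta_chunked <- drop(solve(xtx, xty))      # solve (X'X) b = X'y
names(beta_chunked) <- c("(Intercept)", "x")
beta_lm <- coef(lm(all_y ~ all_x))         # reference fit on the full data
print(beta_chunked)
print(beta_lm)
```

Both fits should agree to within numerical precision. For badly conditioned data the normal equations are less stable than the QR decomposition used by lm(), so treat this as an illustration rather than a replacement.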
It may also be that the problem is not set up well, e.g. no initial estimates were supplied although the model is set to expect user-provided ones. Please either post the project here or email it to support.
thanks, Simon.
Hi Simon.
Attached is an example with 12,000 rows. Only the WNL model 502 fails; there are no problems if it is coded as a PHX-model.
Many cases.phxproj (1.26 MB)
Hi,
I tried with and without initial estimates; it still gave me an error. I think this is related to the number of rows only: with more than 3000 rows it doesn’t work.
Thanks,
Suresh
Hi,
Thanks for the PHX model. It seems to be working. Can you send me the same code for PHX build 6.3.0.395? When I opened the Manyfiles.phx project it gave an error due to the version change.
thanks,
Suresh
Hi,
If nothing works in PHX, I will try R.
thanks for the code,
Suresh
test(){
    covariate(x)
    error(e = 1)
    observe(y = a0 + a1 * x + e)
    fixef(a0 = c(, 0.1, ))
    fixef(a1 = c(, 1, ))
}
thanks,
suresh
Note that the linear model 502 in WNL uses least squares minimization (O)LS, whereas all PHX-models use restricted maximum likelihood (REML).
Without knowing the background of your problem it is difficult to recommend anything. Note that in generating my data I used a variance of 1. For my data set I got with the PHX-model:
a0 0.0769702
a1 1.01384
stdev0 0.989692
Note that not only the parameters (a0, a1) are estimated but the error as well.
In R:
library(nlme)
mod1 <- lm(y ~ x, data=data) # OLS
mod2 <- lme(y ~ x, random= ~ 1 | x, data=data) # REML by default
summary(mod1)
summary(mod2)
Gives:
Call:
lm(formula = y ~ x, data = data)

Residuals:
    Min      1Q  Median      3Q     Max
-3.9928 -0.6668  0.0048  0.6620  5.3788

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.077016   0.012763   6.034 1.64e-09 ***
x           1.013820   0.009098 111.434  < 2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.9898 on 11998 degrees of freedom
Multiple R-squared:  0.5086,  Adjusted R-squared:  0.5086
F-statistic: 1.242e+04 on 1 and 11998 DF,  p-value: < 2.2e-16
Linear mixed-effects model fit by REML
 Data: data
       AIC     BIC    logLik
  33829.03 33858.6 -16910.52

Random effects:
 Formula: ~1 | x
        (Intercept)  Residual
StdDev:  0.05123659 0.9884501

Fixed effects: y ~ x
               Value  Std.Error    DF   t-value p-value
(Intercept) 0.076984 0.01276893 10071   6.02901       0
x           1.013845 0.00910216 10071 111.38508       0
 Correlation:
  (Intr)
x -0.706

Standardized Within-Group Residuals:
         Min           Q1          Med           Q3          Max
-4.028606196 -0.672923072  0.004855385  0.667628142  5.427081375

Number of Observations: 12000
Number of Groups: 10073
nlme/lme() comes close to the estimates of PHX. Personally for such a simple problem I would prefer (O)LS.
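As a side note on the error estimate: for a plain linear model the residual standard deviation can be computed with either n - p or n in the denominator, where p is the number of estimated coefficients. lm() reports the former; a maximum-likelihood fit gives the latter. Which convention PHX follows is not established in this thread, so the sketch below only shows how small the gap is with 12,000 rows. The data are simulated (hypothetical seed), not the data set from the attached project.

```r
# Sketch: residual SD with n - p versus n in the denominator.
set.seed(42)                           # hypothetical seed
n <- 12000
x <- rnorm(n, mean = 1)
y <- x + rnorm(n, mean = 0.1)          # slope 1, intercept 0.1, variance 1
fit <- lm(y ~ x)                       # OLS fit
rss <- sum(residuals(fit)^2)           # residual sum of squares
p   <- length(coef(fit))               # 2 coefficients: intercept and slope
sigma_ols <- sqrt(rss / (n - p))       # "Residual standard error" of summary(fit)
sigma_ml  <- sqrt(rss / n)             # maximum-likelihood estimate
ratio     <- sigma_ml / sigma_ols      # equals sqrt((n - p) / n)
print(c(ols = sigma_ols, ml = sigma_ml, ratio = ratio))
```

With n = 12000 and p = 2 the ratio is sqrt(11998/12000), about 0.99992, so the two conventions differ only in the fifth significant digit.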