  • How to solve a system of over 100 million linear equations with 30 unknowns?

    Hi,

    I have about 100 million equations (observation rows with known and unknown variables after applying weights; there are about 80,000 observations before applying weights) and about 30 unknowns. How do I solve this system of equations? The equation I wish to solve has the following form:

    Yi = ∑ (AijXij)

    where i= 1 to 100 million

    j= 1 to 30
    Yi and Aij are known (variables) in my database
    I need to find values of Xij's

    I have found multiple sets of values for the 30 unknowns Xij, but none of these sets satisfies the remaining equations (observation rows). I want a single solution for Xij with an acceptable confidence interval.
    Also, how can I estimate higher-order terms in case the observations do not follow a first-order linear equation?

    Any help would be sincerely appreciated, since I am not good at Stata programming and have limited knowledge of solving such equations in Stata.

    Regards,
    Amit

  • #2
    If you are looking for an exact solution to these equations, more likely than not, none exists. It is basic linear algebra that if the number of equations is less than the number of unknowns, then the system, provided it is consistent, has infinitely many solutions; if the number of equations equals the number of unknowns and the equations are linearly independent (i.e. the matrix Aij is non-singular), then there is a unique solution; if the number of equations exceeds the number of unknowns, then there may or may not be a solution, but if there is one, then the equations are, in fact, not independent. So unless the data you are working with are highly redundant, so that the 80,000 equations actually are just linear combinations of at most 30 independent equations, there will be no solution.

    Perhaps what you are looking for is a solution to a system of equations like
    Code:
    Yi = ∑ (AijXij) + ei 
    where ei are error terms with a normal distribution having mean 0
    and you want a solution that minimizes ∑ei². That process is known as ordinary least squares linear regression. The -regress- command will do that for you. (It will deal with weights as well.) Since you do not say anything about the organization of your data set, it is not possible to give you specific code, but it sounds like you need to read up on the -regress- command first anyway, and after seeing the examples in the manual, you will probably be able to see how it applies to your data set.
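To make the least-squares idea concrete outside Stata, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not a description of how -regress- works internally. The data are made up (3 unknowns and 5,000 rows standing in for 30 unknowns and 80,000 rows), the weights are treated as frequency weights, and the names `solve_linear` and `weighted_least_squares` are hypothetical helpers. The point it shows is that the number of rows only affects one pass over the data: the system finally solved is k by k, where k is the number of unknowns.

```python
import random

def solve_linear(M, b):
    """Solve the square system M x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are not modified.
    aug = [list(M[r]) + [b[r]] for r in range(n)]
    for col in range(n):
        # Choose the row with the largest pivot to limit rounding error.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("normal equations are singular (collinear columns)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = aug[r][n] - sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / aug[r][r]
    return x

def weighted_least_squares(rows):
    """rows yields (a, y, w): coefficient list a, outcome y, weight w.
    Accumulates A'WA and A'Wy in a single pass, then solves the k x k system."""
    AtA, Aty = None, None
    for a, y, w in rows:
        k = len(a)
        if AtA is None:
            AtA = [[0.0] * k for _ in range(k)]
            Aty = [0.0] * k
        for p in range(k):
            Aty[p] += w * a[p] * y
            for q in range(k):
                AtA[p][q] += w * a[p] * a[q]
    return solve_linear(AtA, Aty)

# Made-up data: y = 2*a1 - 1*a2 + 0.5*a3 + e, with e having mean 0.
random.seed(1)
true_x = [2.0, -1.0, 0.5]
rows = []
for _ in range(5000):
    a = [random.gauss(0, 1) for _ in true_x]
    e = random.gauss(0, 0.1)
    y = sum(aj * xj for aj, xj in zip(a, true_x)) + e
    w = random.randint(1, 3)  # assumed frequency weight for this row
    rows.append((a, y, w))

estimate = weighted_least_squares(rows)
print([round(v, 3) for v in estimate])
```

The estimates should land close to the coefficients used to generate the data. In Stata itself, -regress- with a weight specification is still the practical route, and it also reports the standard errors and confidence intervals that this sketch leaves out.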




    • #3
      Thank you, Dr. Schechter. I was only trying to find an exact solution. However, I will go through the OLS examples and try to follow what you suggest.

      Thanks for your time.

      Regards,
      Amit



      • #4
        If, as it happens, this system of equations does have an exact solution, the regression procedure will produce it for you.



        • #5
          I hope it does, with a sufficiently high R-squared value. Is there any algorithm to find a subset of equations which may have an exact solution?

          Regards



          • #6
            If there is an exact solution, R² should be 1.0. Due to rounding error it might be a tiny bit less than that.
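As a small check of the R² claim, the following Python sketch fits two unknowns by least squares, once on a consistent set of equations and once after perturbing a single right-hand side. Everything in it is illustrative: the five equations are made up, the helper names are hypothetical, and R² is computed here as 1 - RSS/TSS with the total sum of squares taken about the mean of y, which is an assumption of this sketch.

```python
def fit_two_unknowns(rows):
    """Least squares for y = a1*x1 + a2*x2 using the 2x2 normal equations."""
    s11 = sum(a1 * a1 for a1, a2, y in rows)
    s12 = sum(a1 * a2 for a1, a2, y in rows)
    s22 = sum(a2 * a2 for a1, a2, y in rows)
    t1 = sum(a1 * y for a1, a2, y in rows)
    t2 = sum(a2 * y for a1, a2, y in rows)
    det = s11 * s22 - s12 * s12
    x1 = (t1 * s22 - t2 * s12) / det
    x2 = (s11 * t2 - s12 * t1) / det
    return x1, x2

def r_squared(rows, x1, x2):
    """1 - RSS/TSS, with TSS measured about the mean of y (an assumption here)."""
    ys = [y for _, _, y in rows]
    mean_y = sum(ys) / len(ys)
    rss = sum((y - (a1 * x1 + a2 * x2)) ** 2 for a1, a2, y in rows)
    tss = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - rss / tss

# Five equations that all hold exactly for x1 = 3, x2 = -2.
coeffs = [(1.0, 2.0), (2.0, 1.0), (3.0, 5.0), (4.0, -1.0), (0.5, 0.25)]
consistent = [(a1, a2, 3.0 * a1 - 2.0 * a2) for a1, a2 in coeffs]

# Same equations, but the first right-hand side is shifted by 1,
# so no single (x1, x2) satisfies all five any more.
inconsistent = list(consistent)
a1, a2, y = inconsistent[0]
inconsistent[0] = (a1, a2, y + 1.0)

exact_fit = fit_two_unknowns(consistent)
r2_exact = r_squared(consistent, *exact_fit)

noisy_fit = fit_two_unknowns(inconsistent)
r2_noisy = r_squared(inconsistent, *noisy_fit)

print(exact_fit, r2_exact)
print(noisy_fit, r2_noisy)
```

On the consistent set, the fit recovers the generating values and R² equals 1 up to rounding; on the perturbed set, R² falls below 1.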

            My guess is that some algorithm to construct a maximal subset of consistent equations (i.e. having an exact solution) exists, but I don't know of one. Sorry I can't help you with that. But I would be wary of that approach because such a subset is not likely to be unique. In fact, with N unknowns, any subset of exactly N equations will, barring linear dependence among them, have an exact solution. My intuition is that if there are enough redundant equations in your corpus, any of those subsets could be extended to a maximal consistent subset. So there could be several such subsets and it isn't clear how you would decide which one is most suitable for your purposes.
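The non-uniqueness concern can be illustrated with a toy case in Python: two unknowns and four made-up equations that are not mutually consistent. Each pair of equations is solved exactly by Cramer's rule, and pairs that are linearly dependent are reported as having no unique solution. The equations and the helper `solve_pair` are hypothetical and only serve the illustration.

```python
from itertools import combinations

def solve_pair(eq1, eq2):
    """Exactly solve two equations of the form a1*x1 + a2*x2 = y by Cramer's rule.
    Returns None when the pair is linearly dependent (zero determinant)."""
    (a11, a12, y1), (a21, a22, y2) = eq1, eq2
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        return None
    x1 = (y1 * a22 - a12 * y2) / det
    x2 = (a11 * y2 - y1 * a21) / det
    return (x1, x2)

# Each tuple is (a1, a2, y). The four equations cannot all hold at once.
equations = [
    (1.0, 1.0, 3.0),   # x1 + x2 = 3
    (1.0, -1.0, 1.0),  # x1 - x2 = 1
    (2.0, 1.0, 5.5),   # 2*x1 + x2 = 5.5
    (2.0, 2.0, 7.0),   # 2*x1 + 2*x2 = 7 (parallel to the first equation)
]

# Solve every subset of exactly N = 2 equations.
solutions = {}
for i, j in combinations(range(len(equations)), 2):
    solutions[(i, j)] = solve_pair(equations[i], equations[j])

for pair, sol in solutions.items():
    print(pair, sol)

distinct = {sol for sol in solutions.values() if sol is not None}
print(len(distinct), "distinct exact solutions")
```

Every independent pair has its own exact solution, and here no two pairs agree, so picking a subset amounts to picking an answer. Least squares avoids that choice by using all the equations at once.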
