Today, large language models like GPT-4 are not well suited to solving math problems, mainly because of the way these models work under the hood. Using a technique called next token prediction, the model looks at the input it was given (in this case, a math problem) and makes an approximate guess based on the data it was trained on. This tends to work very well for creative tasks like writing, but it is less effective for math, where there is a single definite answer.
With the introduction of the OpenAI Assistants API, you can now rely on more robust tools like Code Interpreter to solve these problems for you. With Code Interpreter, the model writes and runs the code needed to solve a specific math or computational problem, then returns the answer to you. This makes math problems much more likely to be solved correctly by our models right out of the box.
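As a rough illustration of what enabling Code Interpreter can look like, here is a minimal sketch assuming the v1 `openai` Python SDK, where assistants are created through `client.beta.assistants.create`. The assistant name, the instructions text, the model string, and the helper `create_math_assistant` are all illustrative assumptions rather than anything prescribed above; swap in whatever fits your account and use case.

```python
from typing import Any, Dict

# Parameters for an assistant that is allowed to write and run code.
# Assumption: the name, instructions, and model below are placeholders
# chosen for illustration; only the code_interpreter tool is essential.
ASSISTANT_PARAMS: Dict[str, Any] = {
    "name": "Math Tutor",
    "instructions": (
        "You are a personal math tutor. "
        "Write and run code to answer math questions."
    ),
    "tools": [{"type": "code_interpreter"}],
    "model": "gpt-4-1106-preview",
}


def create_math_assistant(client: Any) -> Any:
    """Create an assistant with Code Interpreter enabled.

    `client` is expected to be an `openai.OpenAI` instance. This helper
    is hypothetical and simply forwards the parameters defined above.
    """
    return client.beta.assistants.create(**ASSISTANT_PARAMS)
```

With the `openai` package installed and an API key available in the `OPENAI_API_KEY` environment variable, calling `create_math_assistant(OpenAI())` should return an assistant object whose `id` you can then use when creating threads and runs. The parameters are kept in a plain dictionary so they can be inspected or adjusted without making a network call.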