Commit bc0ecd4

committed
Updated Readme.md
1 parent fb75f57 commit bc0ecd4

5 files changed: 26 additions, 54 deletions

.gitignore

Lines changed: 3 additions & 2 deletions
@@ -278,8 +278,9 @@ ENV/
 env.bak/
 venv.bak/

-# markdown
-**/conversion_content/
+# markdown - ignore everything inside conversion_content but keep the folder
+**/conversion_content/*
+!**/conversion_content/.gitkeep

 # Spyder project settings
 .spyderproject

conversion2025/README.md

Lines changed: 6 additions & 3 deletions
@@ -23,8 +23,11 @@ Ensure you have the following installed:
 - The notebook is designed for scientific documents, but can be extended to other text formats.

 ## How to use
-Place a pdf of your choice into the folder, `/conversion_content`. Name is example.pdf
+Place a pdf of your choice into the folder, `/conversion_content`. Name the pdf file as `example.pdf`.
 Run the converter in Jupiter. A folder with all the convertion content will be produced.
-Right now, a markdown made by Mathpix called `example.md` will be made. To save tokens, Mathpix will not run if `example.md` exists.
-
+for `mathpix_to_llm_to_in2lambda_to_JSON.ipynb`, it will produce a folder called `/mathpix_to_llm_to_in2lambda_to_JSON_out`.
+This will contain all the output of the converter.

+There is a markdown file called `example.md` inside `/mathpix_to_llm_to_in2lambda_to_JSON_out`, this is the markdown version of the pdf.
+As Mathpix rather reliably generates a consistent markdown version of the pdf, the converter will simply start from `example.md`.
+Meaning that if you wish to convert a different pdf, you must delete `example.md` first.
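
The updated README describes a cache-like behaviour: if `example.md` already exists, the converter starts from it rather than calling Mathpix again, so a new pdf is only picked up after the file is deleted. The notebook's own implementation is not shown in this diff; the following is a minimal sketch of that pattern, where `load_or_convert` and the `convert` callback are hypothetical names chosen for illustration.

```python
from pathlib import Path
from typing import Callable


def load_or_convert(md_path: Path, convert: Callable[[], str]) -> str:
    """Return the cached markdown if md_path exists, else convert and cache it.

    `convert` is a hypothetical stand-in for whatever call produces the
    markdown (for example a Mathpix request); it only runs on a cache miss.
    """
    if md_path.exists():
        # Cache hit: reuse the existing markdown, no conversion call is made.
        return md_path.read_text(encoding="utf-8")

    markdown = convert()
    md_path.parent.mkdir(parents=True, exist_ok=True)
    md_path.write_text(markdown, encoding="utf-8")
    return markdown
```

With this shape, deleting `example.md` is what forces a fresh conversion, which matches the instruction added to the README.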
Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+# This file ensures the conversion_content folder is tracked by git
+# while ignoring all other contents

conversion2025/mathpix_to_llm_to_in2lambda_to_JSON.ipynb

Lines changed: 15 additions & 8 deletions
@@ -397,8 +397,9 @@
 " 1. **Content Extraction:**\n",
 " - Identify a suitable `name` for the set of questions.\n",
 " - Identify the `year` if mentioned; otherwise, use \"0\".\n",
-" - For each question, extract the full question text into `question_content` and the revelant full solution text into `solution_content`.\n",
+" - For each question, carefully extract the full question text into `question_content` and the corresponding full solution/answer text into `solution_content`. They may not be in the same section.\n",
 " - If no solution is found, leave `solution_content` as an empty string `\"\"`.\n",
+" - Preserve all image tags like `![pictureTag](filename.jpg)`, making sure they are placed with their respective \"question_content\" and \"solution_content\".\n",
 " - For Each Question extract all image references (e.g., `filename.jpg`) found within the `question_content` and `solution_content` and place them in the `images` list.\n",
 "\n",
 " 2. **Output Format (Crucial):**\n",
@@ -545,7 +546,10 @@
 " 1. **Content Splitting:**\n",
 " - From the input `question_content`, identify the main introductory text (the stem) and place it in the `content` field.\n",
 " - Identify all sub-questions (e.g., \"(a)\", \"(b)\", \"i.\", \"ii.\") and place their text into the `parts` list.\n",
-" - Parts may also be implied, you may also use the solution to infer the parts.\n",
+" - Parts may also be implied.\n",
+" - All Question Must have at least one part.\n",
+" - Ensure that images references are correctly placed with their respective parts.\n",
+" - Preserve all content perfectly, including text, LaTeX, and image tags like `![pictureTag](filename.jpg)`.\n",
 " - Ensure no solution content is included in the `content` or `parts` fields.\n",
 " - The `title` should be a concise summary of the question.\n",
 " - The `images` list should be copied exactly from the input.\n",
@@ -566,7 +570,9 @@
 "\n",
 " 1. **Content Extraction:**\n",
 " - From the `full solution`, find the worked solution that corresponds to the given `question part`.\n",
+" - Make sure the solutions for all parts together include the entire full solution text, with no missing content.\n",
 " - Place this exact text into the `part_solution` field.\n",
+" - Ensure that images references are correctly placed with their respective parts.\n",
 " - Preserve all content perfectly, including text, LaTeX, and image tags like `![pictureTag](filename.jpg)`.\n",
 " - If no specific solution is found, use an empty string `\"\"`.\n",
 "\n",
@@ -711,8 +717,8 @@
 " You MUST return ONLY a single, raw, valid JSON string that strictly follows the original schema. Do NOT add any explanations, comments, or markdown code blocks.\n",
 "\n",
 " Apply these correction rules to the content inside the JSON fields:\n",
-" 1. **JSON Escaping:** All LaTeX backslashes (`\\`) MUST be escaped as double backslashes (`\\\\`). For example, `\\cup` must be written as `\\\\cup`.\n",
-" 2. **Math Delimiters:** All mathematical content must be enclosed in `$...$` for inline math or `$$...$$` for display math. Ensure all delimiters are correctly balanced and closed. '$' and '$$' should not be used for any other purpose.\n",
+" 1. **JSON Escaping:** All LaTeX backslashes (`\\`) MUST be escaped as double backslashes (`\\\\`). For example, `\\cup` must be written as `\\\\cup`. Never escape backslashes for newlines (`\\n`), as they should remain as is.\n",
+" 2. **Math Delimiters:** All mathematical content must be enclosed in `$...$` for inline math or `$$...$$` for display math. Ensure all delimiters are correctly balanced and closed. '$' and '$$' should not be used for any other purpose. Move all `\\n` outside the math delimiters.\n",
 " 3. **Display Math:** `$$` delimiters must be on their own separate lines.\n",
 " 4. **Image Tags:** Preserve image tags like `![pictureTag](filename.jpg)` exactly as they are.\n",
 " 5. **Content Integrity:** Do not change, paraphrase, or summarize any text, formulas, or image links. Only fix formatting errors according to these rules.\n",
@@ -931,11 +937,12 @@
 " print(json.dumps(extracted_dict, indent=2))\n",
 " print(\"Now validating the content...\")\n",
 "\n",
-" content_validated_dict = content_texdown_check(extracted_dict)\n",
-" print(\"successfully validated the content.\")\n",
-" print(\"successfully converted markdown to JSON.\")\n",
+" # content_validated_dict = content_texdown_check(extracted_dict)\n",
+" # print(\"successfully validated the content.\")\n",
+" # print(json.dumps(content_validated_dict, indent=2))\n",
+" # print(\"successfully converted markdown to JSON.\")\n",
 " \n",
-" return content_validated_dict"
+" return extracted_dict"
 ]
 },
 {
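
The revised "JSON Escaping" rule in the prompt separates two cases: a LaTeX backslash must appear doubled in the JSON text, while a newline must stay as a single `\n` escape. As an illustration only (this is not code from the notebook), Python's standard `json` module produces exactly that split when it serialises a string holding one literal backslash and one real newline. The field name `question_content` is taken from the diff; the sample text is made up.

```python
import json

# Raw field text as it would read in the markdown:
# a single LaTeX backslash before "cup"/"cap" and one real newline.
raw = "(a) $A \\cup B$\n(b) $A \\cap B$"

# Serialising doubles the LaTeX backslash and writes the newline as \n.
encoded = json.dumps({"question_content": raw})

# Parsing the JSON text restores the original string unchanged.
decoded = json.loads(encoded)["question_content"]

print(encoded)
print(decoded == raw)
```

If a model instead doubles the backslash of `\n`, the parsed text contains a literal backslash followed by `n` rather than a line break, which is presumably the failure the added sentence in rule 1 guards against.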

conversion2025/testing.json

Lines changed: 0 additions & 41 deletions
Original file line numberDiff line numberDiff line change
@@ -1,41 +0,0 @@
1-
{
2-
"name": "Mathematical Analysis (1st Year) - Problem Sheet 1",
3-
"year": "0",
4-
"questions": [
5-
{
6-
"question_content": "1. To gain some familiarity with sets and operations on them, prove the following relations algebraically, and where possible draw the corresponding Venn diagrams.\n(a) $A \\cup \\emptyset=A$\n(b) $A \\cap \\emptyset=\\emptyset$\n(c) $A \\cup B=B \\cup A$\n(d) $A \\cap B=B \\cap A$\n(e) $A \\subseteq A \\cup B$\n(f) $A \\cup(B \\cap A)=A$\n(g) $A \\cap(B \\cup A)=A$\n(h) $(A \\backslash C) \\cap(B \\backslash C)=(A \\cap B) \\backslash C$\n(i) $(A \\cap B)^{\\prime}=A^{\\prime} \\cup B^{\\prime}$",
7-
"solution_content": "1. (a) $A \\cup \\phi=A$ \n(i) $x \\in A \\cup \\phi \\Rightarrow x \\in A$ or $x \\in \\varnothing$ so $x \\in A$ $\\Rightarrow A \\cup \\Phi \\subseteq A$. \n(ii) $x \\in A \\Rightarrow x \\in A$ or $x \\in \\phi$ (although $x \\notin \\phi$ ) $\\Rightarrow x \\in A \\cup \\varnothing$ so $A \\subseteq A \\cup \\phi$ hence $A=A \\cup \\phi$. \n(b) $A \\cap \\phi=\\phi$. \nLHS means $x \\in A$ and $x \\in \\phi$, but that are no $x \\in \\varnothing$ so the statement is vacuous and therefore $A \\cap \\varnothing=\\varnothing$. \n(c) $A \\cup B=\\{x \\mid x \\in A$ or $x \\in B\\}$ \n$$\n=\\{x \\mid x \\in B \\text { or } x \\in A\\}=B \\cup A\n$$\n(d) $A \\cap B=\\{x \\mid x \\in A$ and $x \\in B\\}=B \\cap A$. \n(e) $A \\subseteq A \\cup B$ $x \\in A \\Rightarrow x \\in A$ or $B$, ie $x \\in A \\cup B$ $\\Rightarrow A \\subseteq A \\cup B$. \n(f) $A \\cup(B \\cap A)=A$ $x \\in A \\cup(B \\cap A)$ means $x \\in A$ or $x \\in(B \\cap A)$ $x \\in B$ and $x \\in A$. \n(g) $A \\cap(B \\cup A)=A$ $x \\in A \\cap(B \\cup A)$ means $x \\in A$ and $(x \\in B$ or $x \\in A)$. \n(h) $(A \\backslash C) \\cap(B \\backslash C)=(A \\cap B) \\backslash C$ $x \\in(A \\backslash C) \\cap(B \\backslash C)$ $\\Rightarrow x \\in(A \\backslash C)$ and $x \\in(B \\backslash C)\\Rightarrow x \\in A, x \\notin C$ and $x \\in B, x \\notin C\\Rightarrow x \\in A$ and $x \\in B$ and $x \\notin C\\Rightarrow x \\in A \\cap B, x \\notin C\\Rightarrow x \\in A \\cap B \\backslash C$. \n(i) $(A \\cap B)^{\\prime}=A^{\\prime} \\cup B^{\\prime}$ $x^{\\notin(A \\text { and } B)}$. \n$$\n\\begin{aligned}\n(A \\cap B)^{\\prime}= & \\{x \\mid x \\notin(A \\cap B)\\} \\\\\n& \\Rightarrow x \\notin A \\text { or } x \\notin B \\\\\n& \\Rightarrow x \\in A^{\\prime} \\text { or } x \\in B^{\\prime} \\\\\n& =A^{\\prime} \\cup B^{\\prime}\n\\end{aligned}\n$$\n2. (a) $A \\cap B=\\{1,2,3\\} \\cap\\{1,2\\}=\\{1,2\\}=B$. \n(b) $A \\cup B=\\{1,2,3\\} \\cup\\{1,2\\}=\\{1,2,3\\}=A$. 
\n(c) $A \\cap(B \\cap C)=A \\cap\\{1\\}=\\{1\\}=E$. \n(d) $(C \\cup A) \\cap B=A \\cap B=B$. \n(e) $A \\backslash B=\\{3\\}=G$. \n(f) $C \\backslash A=\\phi=H$. \n(g) $(D \\backslash F) \\cup(F \\backslash D)=\\{3\\} \\cup \\phi=\\{3\\}=G$. \n(h) $G \\backslash A=\\phi=H$. \n(i) $A \\cup((B \\backslash C) \\backslash F)=A \\cup(\\{2\\} \\backslash F)\\Rightarrow A \\cup \\phi=A=\\text{(k) } H \\cup H=H$. \n3. $x_{1}, x_{2}, \\ldots x_{n+1}$ \nLet $\\frac{x_{1}}{n}=p_{1}+F_{1} \\cdot \\frac{x_{i}}{n}=p_{i}+F_{i}$ etc. $i=1, n+1$. \n$p_{i} \\in \\mathbb{N}, F_{i}$ is the fractional remainders and clearly $F_{i}$ takes one of the values $F_{c}=0 / n, 1 / n, \\cdots \\text { or } n-1 / n$ \nHence there are $n$ usable distinct values for $(n+1) F_{i} \\Rightarrow 2$ of the $F_{i}$ must be equal, $F^{\\prime}$ and $F^{\\prime \\prime}$ say. \n4. Let $f: A \\rightarrow B, g: B \\rightarrow C$. \n(a) If $f, g$ are surjective. If $c \\in C$, then $\\exists \\quad b \\in B$ such that $g(b)=c$. Also $f$ is surjective so $\\exists a \\in A$ such that $f(a)=b$. \nSo $(g \\circ f)(a)=g(f(a))=g(b)=c$ hence $g \\circ f$ is SURJECTIVE. \n(b) If $f, g$ are injective. $a, a^{\\prime} \\in A$ and $y \\quad a \\neq a^{\\prime}$, then $f(a) \\neq f(a^{\\prime})$, since $f$ injective. Since $g$ is also injective, $g(f(a)) \\neq g(f(a^{\\prime}))$, so $g \\circ f$ is injective. \n5. (i) \n$$\n\\begin{aligned}\n& y=(f \\circ g)^{-1}(x) \\\\\n\\Rightarrow \\quad & x =(f \\circ g)(y) \\\\\n& =f(g(y)) \\\\\n\\Rightarrow \\quad & f^{-1}(x)=g(y) \\\\\n\\Rightarrow \\quad & g^{-1}(f^{-1}(x))=y\n\\end{aligned}\n$$\nComparing (1) and (2) $\\Rightarrow (f \\circ g)^{-1}=g^{-1} \\circ f^{-1}$. \n(ii) $(f \\circ(g \\circ h))(x)$ \n$$\n\\begin{aligned}\n& =f((g \\circ h)(x)) \\\\\n& =f(g(h(x)) \\\\\n& =(f \\circ g)(h(x)) \\\\\n& =((f \\circ g) \\circ h)(x)\n\\end{aligned}\n$$\nso $f \\circ(g \\circ h)=(f \\circ g) \\circ h$.",
8-
"images": [
9-
"0_2025_07_14_0dc1d2fe9dc3cfc99f10g-03.jpg",
10-
"1_2025_07_14_0dc1d2fe9dc3cfc99f10g-04.jpg",
11-
"2_2025_07_14_0dc1d2fe9dc3cfc99f10g-04.jpg",
12-
"3_2025_07_14_0dc1d2fe9dc3cfc99f10g-04.jpg",
13-
"4_2025_07_14_0dc1d2fe9dc3cfc99f10g-05.jpg",
14-
"5_2025_07_14_0dc1d2fe9dc3cfc99f10g-05.jpg",
15-
"6_2025_07_14_0dc1d2fe9dc3cfc99f10g-06.jpg",
16-
"7_2025_07_14_0dc1d2fe9dc3cfc99f10g-09.jpg",
17-
"8_2025_07_14_0dc1d2fe9dc3cfc99f10g-09.jpg"
18-
]
19-
},
20-
{
21-
"question_content": "2. Let $A=\\{1,2,3\\}, B=\\{1,2\\}, C=\\{1,3\\}, D=\\{2,3\\}, E=\\{1\\}, F=\\{2\\}, G=\\{3\\}$, $H=\\emptyset$. Simplify the following expressions. In each case the answer should be one of the sets $A, B \\cdots H$.\n(a) $A \\cap B$\n(b) $A \\cup B$\n(c) $A \\cap(B \\cap C)$\n(d) $(C \\cup A) \\cap B$\n(e) $A \\backslash B$\n(f) $C \\backslash A$\n(g) $(D \\backslash F) \\cup(F \\backslash D)$\n(h) $G \\backslash A$\n(i) $A \\cup((B \\backslash C) \\backslash F)$\n(j) $H \\cup H$\n(k) $A \\cap A$\n(l) $((B \\cup C) \\cap C) \\cup H$",
22-
"solution_content": "2. (a) $A \\cap B=\\{1,2,3\\} \\cap\\{1,2\\}=\\{1,2\\}=B$. \n(b) $A \\cup B=\\{1,2,3\\} \\cup\\{1,2\\}=\\{1,2,3\\}=A$. \n(c) $A \\cap(B \\cap C)=A \\cap\\{1\\}=\\{1\\}=E$. \n(d) $(C \\cup A) \\cap B=A \\cap B=B$. \n(e) $A \\backslash B=\\{3\\}=G$. \n(f) $C \\backslash A=\\phi=H$. \n(g) $(D \\backslash F) \\cup(F \\backslash D)=\\{3\\} \\cup \\phi=\\{3\\}=G$. \n(h) $G \\backslash A=\\phi=H$. \n(i) $A \\cup((B \\backslash C) \\backslash F)=A \\cup(\\{2\\} \\backslash F)=A \\cup\\phi=A=\\text{(k) } H \\cup H=H$.",
23-
"images": []
24-
},
25-
{
26-
"question_content": "3. Use the pigeonhole principle to prove that in any set of $n+1$ integers, there must be two integers whose difference is divisible by $n$.",
27-
"solution_content": "3. $x_{1}, x_{2}, \\ldots x_{n+1}$ \nLet $\\frac{x_{1}}{n}=p_{1}+F_{1} \\cdot \\frac{x_{i}}{n}=p_{i}+F_{i}$ etc. $i=1, n+1$. \n$p_{i} \\in \\mathbb{N}, F_{i}$ is the fractional remainders and clearly $F_{i}$ takes one of the values $F_{c}=0 / n, 1 / n, \\cdots \\text { or } n-1 / n$ \nHence there are $n$ usable distinct values for $(n+1) F_{i} \\Rightarrow 2$ of the $F_{i}$ must be equal, $F^{\\prime}$ and $F^{\\prime \\prime}$ say.",
28-
"images": []
29-
},
30-
{
31-
"question_content": "4. Show that the composition of two injective maps is injective, and the composition of two surjective maps is surjective. Deduce that the composition of two bijective maps is bijective.",
32-
"solution_content": "4. Let $f: A \\rightarrow B, g: B \\rightarrow C$. \n(a) If $f, g$ are surjective. If $c \\in C$, then $\\exists \\quad b \\in B$ such that $g(b)=c$. Also $f$ is surjective so $\\exists a \\in A$ such that $f(a)=b$. \nSo $(g \\circ f)(a)=g(f(a))=g(b)=c$ hence $g \\circ f$ is SURJECTIVE. \n(b) If $f, g$ are injective. $a, a^{\\prime} \\in A$ and $y \\quad a \\neq a^{\\prime}$, then $f(a) \\neq f(a^{\\prime})$, since $f$ injective. Since $g$ is also injective, $g(f(a)) \\neq g(f(a^{\\prime}))$, so $g \\circ f$ is injective.",
33-
"images": []
34-
},
35-
{
36-
"question_content": "5. In the following question you may assume that the domains and codomains of the functions $f, g, h$ are suitably defined and that inverses exist. Show that\n(i) $(f \\circ g)^{-1}=g^{-1} \\circ f^{-1}$\n(ii) $f \\circ(g \\circ h)=(f \\circ g) \\circ h$",
37-
"solution_content": "5. (i) \n$$\n\\begin{aligned}\n& y=(f \\circ g)^{-1}(x) \\\\\n\\Rightarrow \\quad & x =(f \\circ g)(y) \\\\\n& =f(g(y)) \\\\\n\\Rightarrow \\quad & f^{-1}(x)=g(y) \\\\\n\\Rightarrow \\quad & g^{-1}(f^{-1}(x))=y\n\\end{aligned}\n$$\nComparing (1) and (2) $\\Rightarrow (f \\circ g)^{-1}=g^{-1} \\circ f^{-1}$. \n(ii) $(f \\circ(g \\circ h))(x)$ \n$$\n\\begin{aligned}\n& =f((g \\circ h)(x)) \\\\\n& =f(g(h(x)) \\\\\n& =(f \\circ g)(h(x)) \\\\\n& =((f \\circ g) \\circ h)(x)\n\\end{aligned}\n$$\nso $f \\circ(g \\circ h)=(f \\circ g) \\circ h$.",
38-
"images": []
39-
}
40-
]
41-
}
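
In the deleted fixture above, the first question lists nine files under `images` although neither `question_content` nor `solution_content` contains an image tag, which may be the kind of mismatch the new "Preserve all image tags" prompt rules aim to prevent. A minimal sketch of a check for this, assuming the field names from the fixture; `image_refs` and `unreferenced_images` are hypothetical helpers, not functions from the notebook.

```python
import re

# Matches markdown image tags such as ![pictureTag](filename.jpg)
# and captures the file name inside the parentheses.
IMAGE_TAG = re.compile(r"!\[[^\]]*\]\(([^)\s]+)\)")


def image_refs(text: str) -> list[str]:
    """Return the file names of all image tags found in text, in order."""
    return IMAGE_TAG.findall(text)


def unreferenced_images(question: dict) -> list[str]:
    """Return entries of `images` with no matching tag in the question or solution text."""
    text = "\n".join(
        [question.get("question_content", ""), question.get("solution_content", "")]
    )
    found = set(image_refs(text))
    return [name for name in question.get("images", []) if name not in found]
```

Running `unreferenced_images` over each entry of `questions` would flag every listed file whose tag was dropped during extraction.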
