|
|
|
@ -0,0 +1,956 @@
|
|
|
|
|
{
|
|
|
|
|
"cells": [
|
|
|
|
|
{
|
|
|
|
|
"cell_type": "markdown",
|
|
|
|
|
"metadata": {},
|
|
|
|
|
"source": [
|
|
|
|
|
"# Cook Better data\n"
|
|
|
|
|
]
|
|
|
|
|
},
|
|
|
|
|
{
|
|
|
|
|
"cell_type": "markdown",
|
|
|
|
|
"metadata": {},
|
|
|
|
|
"source": [
|
|
|
|
|
"## ❌ Outdated plan\n",
|
|
|
|
|
"- Our current data is just general QA as the question and answer is extracted directly from one document\n",
|
|
|
|
|
"- The plan is to cook better and harder data, about 300 to 500 questions - answers pairs, should be hard that the model need multihop, should take Musique paper as reference. The retrival environment should also be good enough (must be noisy enough)\n",
|
|
|
|
|
"- Or just training follow related works (use the train set of musique, or nq + hotpotqa)\n",
|
|
|
|
|
"- For example: ReSearch: \"We only use the training set (19938 samples) of MuSiQue for training, and the number of training epochs is 2.\" -> this is a lot of data compare to our current experiment\n",
|
|
|
|
|
"- Multiple difficulty level would also be good"
|
|
|
|
|
]
|
|
|
|
|
},
|
|
|
|
|
{
|
|
|
|
|
"cell_type": "markdown",
|
|
|
|
|
"metadata": {},
|
|
|
|
|
"source": [
|
|
|
|
|
"### Noise Maker\n",
|
|
|
|
|
"- Search for \"Nguyen Van A\" -> got 10 \"Nguyen Van A (cook, teacher, police,... etc)\" -> the model will be confused and try to correct its query to be more specific\n",
|
|
|
|
|
"- Ohhh in Musicque paper they call this \"Distractor\" (20 paragraphs)\n",
|
|
|
|
|
"-> So bascially we are trying to reinvent the wheel here, but our wheel is not as good as this 💀"
|
|
|
|
|
]
|
|
|
|
|
},
|
|
|
|
|
{
|
|
|
|
|
"cell_type": "markdown",
|
|
|
|
|
"metadata": {},
|
|
|
|
|
"source": [
|
|
|
|
|
"### Some meh scenarios\n",
|
|
|
|
|
"- bro, i gave you the correct passges, but you still get it wrong?? -> must be reading comprehension issue\n",
|
|
|
|
|
"- bro, you didn't retriveed the correct context passages, but you still able to get the answer right?? -> bro is contaminated\n",
|
|
|
|
|
" \n",
|
|
|
|
|
"- when the model answer wrong, is it because it can not search fo the correct passages? or the correct passges were retrieved but the model still get it wrong? - Thanks @new5558\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"```\n",
|
|
|
|
|
"There are 4 main metrics to eval when evaluating RAGS based summarization/deep reserach system.\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"Context Relevancy: Check if retrieved documents can answer the question\n",
|
|
|
|
|
"Faithfulness: Check if the result from LLM can be found in the retrived documents\n",
|
|
|
|
|
"Answer Relevancy: Check if the result from LLM is related to the question\n",
|
|
|
|
|
"Comparison with answer: Check if the result from LLM is same as the ground truth\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"All four metrics can be eval using human/llm as a judge.\n",
|
|
|
|
|
"```"
|
|
|
|
|
]
|
|
|
|
|
},
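The "meh" scenarios above form a 2x2 table: did retrieval find the supporting passages, and was the final answer correct? A minimal sketch of that bucketing, assuming each rollout logs the retrieved paragraph ids, the gold supporting ids, and a correctness flag. The function name and labels are made up for illustration and do not come from any library.

```python
def diagnose(retrieved_ids, supporting_ids, answer_correct):
    """Bucket one rollout by retrieval success and answer correctness.

    Hypothetical helper: `retrieved_ids` are the paragraph ids the agent
    pulled during its search loop, `supporting_ids` are the gold
    `is_supporting` paragraphs for the question.
    """
    # Retrieval counts as successful only if every supporting paragraph was found.
    retrieved_all = set(supporting_ids) <= set(retrieved_ids)

    if retrieved_all and answer_correct:
        return "ok"
    if retrieved_all and not answer_correct:
        # Correct passages were in context, answer still wrong.
        return "reading_comprehension"
    if not retrieved_all and answer_correct:
        # Right answer without the evidence: the model may have memorized it.
        return "possible_contamination"
    return "retrieval_failure"


print(diagnose([5, 10, 3], [5, 10], answer_correct=False))
```

Counting these four labels over an eval run separates search failures from reading failures, which is the question raised above.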
|
|
|
|
|
{
|
|
|
|
|
"cell_type": "markdown",
|
|
|
|
|
"metadata": {},
|
|
|
|
|
"source": [
|
|
|
|
|
"## ✅ Just try musique as this dataset is so peak 🔥\n",
|
|
|
|
|
"below is the first sample of the dev set of musique (answerable)\n",
|
|
|
|
|
"- id\n",
|
|
|
|
|
"- paragraphs\n",
|
|
|
|
|
" - **is_supporting**: if true, this paragraph is the supporting evidence for the answer, else it is a distractor\n",
|
|
|
|
|
"- question: the question\n",
|
|
|
|
|
"- answer: the answer\n",
|
|
|
|
|
"- question_decomposition:\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"> MuSiQue constitutes unique 21020 single-hop questions, 4132 answers to multihop questions, 19841 answers to singlehop questions, and 7676 supporting paragraphs. MuSiQue has 6 types of reasoning graphs and 2-4 hops\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"-> can categorize: 2 hops -> easy, 3 hops -> medium, 4 hops -> hard\n",
|
|
|
|
|
"- id structure: xhop_<question_id_1>_<question_id_2>_..._<question_id_x>\n",
|
|
|
|
|
"- cool prefix: 2hop_, 3hop1, 3hop2, 4hop1, 4hop2, 4hop3\n",
|
|
|
|
|
"- 2hop, 3hop1, 4hop1 is linear\n",
|
|
|
|
|
"- 3hop2, 4hop2, 4hop3 is non-linear\n",
|
|
|
|
|
"- each decomposed question is created from one paragraph, which is nice!\n",
|
|
|
|
|
"- follow related works, we won't use the decomposition questions, but only the combined question and the final \n",
|
|
|
|
|
"- 💡 Not mentioned in other works: beside answer, there is a propety called \"answer_aliases\", which is a list. -> i think i should make the final answer look like this: \"Answer OR Alias01 OR Alias02,...\" as we will use llm for correctness, not exact match\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"for the passages database side:\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"- what should i do with the title of the support passages??? -> concat them at the beginning of the paragraph, as that's what flashrag return (i should mimic this behavior)\n",
|
|
|
|
|
"- use faiss, but need to change to the same embedding setting as flashrag (E5-base-v2, bla bla)\n",
|
|
|
|
|
"- there are random unicode like \"\\u2013\" in passeges -> need to handle?\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"### Quick eda train set\n",
|
|
|
|
|
"Total rows: 19938\n",
|
|
|
|
|
"Number of unique IDs: 19938 ✅ ok this is legit, same as the paper claimed\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"Prefix counts:\n",
|
|
|
|
|
"- 2hop_: 14376\n",
|
|
|
|
|
"- 3hop2_: 650\n",
|
|
|
|
|
"- 3hop1_: 3737\n",
|
|
|
|
|
"- 4hop1_: 648\n",
|
|
|
|
|
"- 4hop2_: 127\n",
|
|
|
|
|
"- 4hop3_: 400\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"Percentage distribution:\n",
|
|
|
|
|
"- 2hop_: 72.10%\n",
|
|
|
|
|
"- 3hop2_: 3.26%\n",
|
|
|
|
|
"- 3hop1_: 18.74%\n",
|
|
|
|
|
"- 4hop1_: 3.25%\n",
|
|
|
|
|
"- 4hop2_: 0.64%\n",
|
|
|
|
|
"- 4hop3_: 2.01%\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"### Support passages length distribution\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"- Why am I caculating this? https://unsloth.ai/blog/grpo \"Long-context GRPO\"\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"> 1 token ~= 4 chars in English. 1 token ~= ¾ words. 100 tokens ~= 75 words. Or. 1-2 sentence ~= 30 tokens. 1 paragraph ~= 100 tokens. 1,500 words ~= 2048 tokens. - OpenAI\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"> With a GRPO setup using TRL + FA2, Llama 3.1 (8B) training at 20K context length demands 510.8GB of VRAM. However, Unsloth’s 90% VRAM reduction brings the requirement down to just 54.3GB in the same setup.\n",
|
|
|
|
|
"- https://github.com/agentica-project/rllm\n",
|
|
|
|
|
" > [2025/02/10] We release DeepScaleR-1.5B-Preview, a 1.5B model that surpasses O1-Preview and achieves 43.1% Pass@1 on AIME. We achieve this by iteratively scaling Deepseek's GRPO algorithm from 8K→16K->24K context length for thinking. As part of this release, we open-source: \n",
|
|
|
|
|
"\n",
|
|
|
|
|
" -> **SIZE MATTER! LONGER IS BETTER!**\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"https://huggingface.co/docs/trl/en/dpo_trainer#trl.DPOTrainer.tokenize_row.max_prompt_length\n",
|
|
|
|
|
"\n",
|
|
|
|
|
" Total samples: 19938\n",
|
|
|
|
|
"Answerable samples: 19938 (100.00%)\n",
|
|
|
|
|
"Samples with non-empty answer_aliases: 5067 (25.41%)\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"Paragraph length statistics (words):\n",
|
|
|
|
|
" Min: 19\n",
|
|
|
|
|
" Max: 299\n",
|
|
|
|
|
" ℹ️ℹ️ Mean: 79.02 -> **~ 100 - 110 tokens -> k5 -> 500 - 600 tokens for 1 query -> 4 hops -> 2000 ~ 2500 tokens (happy scenario when the model was able to complete in 4 hops)**\n",
|
|
|
|
|
" Median: 68.00\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"Paragraph length statistics (characters):\n",
|
|
|
|
|
" Min: 100\n",
|
|
|
|
|
" Max: 2000\n",
|
|
|
|
|
" Mean: 481.69\n",
|
|
|
|
|
" Median: 414.00\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"Title length statistics (words):\n",
|
|
|
|
|
" Min: 1\n",
|
|
|
|
|
" Max: 23\n",
|
|
|
|
|
" Mean: 2.98\n",
|
|
|
|
|
" Median: 3.00\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"Answer length statistics (words):\n",
|
|
|
|
|
" Min: 1\n",
|
|
|
|
|
" Max: 14\n",
|
|
|
|
|
" Mean: 2.42\n",
|
|
|
|
|
" Median: 2.00\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"Question length statistics (words):\n",
|
|
|
|
|
" Min: 4\n",
|
|
|
|
|
" Max: 46\n",
|
|
|
|
|
" Mean: 15.96\n",
|
|
|
|
|
" Median: 15.00\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"Question length statistics (characters):\n",
|
|
|
|
|
" Min: 29\n",
|
|
|
|
|
" ℹ️ℹ️ℹ Max: 283 -> the prompt is currently about 200 words long -> plus this is about 500 - 600 max_prompt_tokens for just starting prompt. but the prompt size will gradually increase as the agent loop continue (think + query + docs)\n",
|
|
|
|
|
" Mean: 89.97\n",
|
|
|
|
|
" Median: 83.00\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"Question length by hop count (words):\n",
|
|
|
|
|
" 2hop: 13.91 words (n=14376)\n",
|
|
|
|
|
" 3hop: 19.85 words (n=4387)\n",
|
|
|
|
|
" 4hop: 26.56 words (n=1175)\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"Unique characters in questions (lowercased):\n",
|
|
|
|
|
" !\"#$&'()+,-./0123456789:;=?[]`abcdefghijklmnopqrstuvwxyz£²×ßàáâãäåçèéêëìíïñòóôõöøúüýāăćčđēěğģīıłńňōřśşšũūżžșțə̇ạầặếễệọốộủỳ–—‘’\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"Unique characters in answers (lowercased):\n",
|
|
|
|
|
" \"$%&'()+,-./0123456789:`abcdefghijklmnopqrstuvwxyz¡¢£¤¥§¨©ª¬°±³¶¸ºáâãäæè\n",
|
|
|
|
|
" - bro wtf is \"\"????\n"
|
|
|
|
|
]
|
|
|
|
|
},
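The "Answer OR Alias01 OR Alias02" idea from the notes above can be sketched as a small helper. This is only one way to build the reference string for an LLM judge. The function name and the case-insensitive de-duplication are assumptions, not part of MuSiQue or any related work.

```python
def build_reference_answer(answer, answer_aliases):
    """Join the gold answer and its aliases into one 'A OR B OR C' string.

    Hypothetical helper: duplicates are dropped case-insensitively and the
    gold answer always comes first.
    """
    seen = set()
    parts = []
    for candidate in [answer, *answer_aliases]:
        text = candidate.strip()
        key = text.lower()
        if text and key not in seen:
            seen.add(key)
            parts.append(text)
    return " OR ".join(parts)


# Sample with no aliases (like the 2hop example in this notebook).
print(build_reference_answer("Miquette Giraudy", []))
```

Only about a quarter of the train samples have non-empty `answer_aliases`, so for most rows the reference is just the plain answer.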
|
|
|
|
|
{
|
|
|
|
|
"cell_type": "markdown",
|
|
|
|
|
"metadata": {},
|
|
|
|
|
"source": [
|
|
|
|
|
"## ✅ Feel the data "
|
|
|
|
|
]
|
|
|
|
|
},
|
|
|
|
|
{
|
|
|
|
|
"cell_type": "code",
|
|
|
|
|
"execution_count": null,
|
|
|
|
|
"metadata": {},
|
|
|
|
|
"outputs": [],
|
|
|
|
|
"source": [
|
|
|
|
|
"# Quick id distribution\n",
|
|
|
|
|
"import json\n",
|
|
|
|
|
"import os\n",
|
|
|
|
|
"from collections import Counter\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"# Path to the jsonl file\n",
|
|
|
|
|
"file_path = \"../data/raw/musique_ans_v1.0_train.jsonl\"\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"# Initialize counters\n",
|
|
|
|
|
"total_rows = 0\n",
|
|
|
|
|
"unique_ids = set()\n",
|
|
|
|
|
"prefix_counts = Counter()\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"# Read the jsonl file\n",
|
|
|
|
|
"with open(file_path, 'r', encoding='utf-8') as f:\n",
|
|
|
|
|
" for line in f:\n",
|
|
|
|
|
" total_rows += 1\n",
|
|
|
|
|
" data = json.loads(line)\n",
|
|
|
|
|
" unique_ids.add(data['id'])\n",
|
|
|
|
|
" \n",
|
|
|
|
|
" # Count prefixes\n",
|
|
|
|
|
" for prefix in ['2hop_', '3hop1_', '3hop2_', '4hop1_', '4hop2_', '4hop3_']:\n",
|
|
|
|
|
" if data['id'].startswith(prefix):\n",
|
|
|
|
|
" prefix_counts[prefix] += 1\n",
|
|
|
|
|
" break\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"# Print results\n",
|
|
|
|
|
"print(f\"Total rows: {total_rows}\")\n",
|
|
|
|
|
"print(f\"Number of unique IDs: {len(unique_ids)}\")\n",
|
|
|
|
|
"print(\"\\nPrefix counts:\")\n",
|
|
|
|
|
"for prefix, count in prefix_counts.items():\n",
|
|
|
|
|
" print(f\"- {prefix}: {count}\")\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"# Calculate percentage for each prefix\n",
|
|
|
|
|
"print(\"\\nPercentage distribution:\")\n",
|
|
|
|
|
"for prefix, count in prefix_counts.items():\n",
|
|
|
|
|
" percentage = (count / total_rows) * 100\n",
|
|
|
|
|
" print(f\"- {prefix}: {percentage:.2f}%\")\n"
|
|
|
|
|
]
|
|
|
|
|
},
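The prefix counting above can be turned into a difficulty label per sample. A rough sketch follows. The double-underscore id layout is inferred from the sample ids shown in this notebook (`2hop__460946_294723`, `3hop1__145427_106426_77199`). The easy/medium/hard mapping is the one proposed in the notes, and the function name is hypothetical.

```python
import re

# Assumed layout: "<hops>hop[<variant>]__<id_1>_<id_2>_..."
ID_PATTERN = re.compile(r"^(?P<hops>[234])hop(?P<variant>\d?)__(?P<ids>\d+(?:_\d+)*)$")

# Mapping proposed in the notes: 2 hops -> easy, 3 -> medium, 4 -> hard.
DIFFICULTY = {2: "easy", 3: "medium", 4: "hard"}


def parse_musique_id(sample_id):
    """Split a MuSiQue sample id into hop count, graph variant and sub-question ids."""
    match = ID_PATTERN.match(sample_id)
    if match is None:
        raise ValueError(f"Unexpected id format: {sample_id!r}")
    hops = int(match.group("hops"))
    return {
        "hops": hops,
        "variant": match.group("variant") or None,  # None for plain 2hop ids
        "sub_question_ids": [int(x) for x in match.group("ids").split("_")],
        "difficulty": DIFFICULTY[hops],
    }


print(parse_musique_id("3hop1__145427_106426_77199"))
```

Raising on an unexpected id (instead of silently skipping it) makes it obvious if a split uses a different naming scheme.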
|
|
|
|
|
{
|
|
|
|
|
"cell_type": "code",
|
|
|
|
|
"execution_count": null,
|
|
|
|
|
"metadata": {},
|
|
|
|
|
"outputs": [],
|
|
|
|
|
"source": [
|
|
|
|
|
"# length distribution of support passages\n",
|
|
|
|
|
"import matplotlib.pyplot as plt\n",
|
|
|
|
|
"import numpy as np\n",
|
|
|
|
|
"import re\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"# Path to the jsonl file\n",
|
|
|
|
|
"file_path = \"../data/raw/musique_ans_v1.0_train.jsonl\"\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"# Initialize lists to store lengths\n",
|
|
|
|
|
"para_word_lengths = []\n",
|
|
|
|
|
"para_char_lengths = []\n",
|
|
|
|
|
"title_word_lengths = []\n",
|
|
|
|
|
"title_char_lengths = []\n",
|
|
|
|
|
"answer_word_lengths = []\n",
|
|
|
|
|
"answer_char_lengths = []\n",
|
|
|
|
|
"question_word_lengths = []\n",
|
|
|
|
|
"question_char_lengths = []\n",
|
|
|
|
|
"non_empty_aliases_count = 0\n",
|
|
|
|
|
"question_chars = set()\n",
|
|
|
|
|
"answer_chars = set()\n",
|
|
|
|
|
"answerable_count = 0\n",
|
|
|
|
|
"total_samples = 0\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"# For hop analysis\n",
|
|
|
|
|
"hop_question_lengths = {\n",
|
|
|
|
|
" '2hop': [],\n",
|
|
|
|
|
" '3hop': [],\n",
|
|
|
|
|
" '4hop': []\n",
|
|
|
|
|
"}\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"# Read the jsonl file\n",
|
|
|
|
|
"with open(file_path, 'r', encoding='utf-8') as f:\n",
|
|
|
|
|
" for line in f:\n",
|
|
|
|
|
" data = json.loads(line)\n",
|
|
|
|
|
" total_samples += 1\n",
|
|
|
|
|
" \n",
|
|
|
|
|
" # Check if answerable\n",
|
|
|
|
|
" if data.get('answerable', False):\n",
|
|
|
|
|
" answerable_count += 1\n",
|
|
|
|
|
" \n",
|
|
|
|
|
" # Check answer aliases\n",
|
|
|
|
|
" if data.get('answer_aliases') and len(data['answer_aliases']) > 0:\n",
|
|
|
|
|
" non_empty_aliases_count += 1\n",
|
|
|
|
|
" \n",
|
|
|
|
|
" # Collect unique characters in questions and answers\n",
|
|
|
|
|
" if 'question' in data:\n",
|
|
|
|
|
" question = data['question']\n",
|
|
|
|
|
" question_chars.update(question.lower())\n",
|
|
|
|
|
" \n",
|
|
|
|
|
" # Count words and characters in question\n",
|
|
|
|
|
" question_words = question.split()\n",
|
|
|
|
|
" question_word_lengths.append(len(question_words))\n",
|
|
|
|
|
" question_char_lengths.append(len(question))\n",
|
|
|
|
|
" \n",
|
|
|
|
|
" # Extract hop count from ID for analysis\n",
|
|
|
|
|
" if 'id' in data:\n",
|
|
|
|
|
" if data['id'].startswith('2hop'):\n",
|
|
|
|
|
" hop_question_lengths['2hop'].append(len(question_words))\n",
|
|
|
|
|
" elif data['id'].startswith('3hop'):\n",
|
|
|
|
|
" hop_question_lengths['3hop'].append(len(question_words))\n",
|
|
|
|
|
" elif data['id'].startswith('4hop'):\n",
|
|
|
|
|
" hop_question_lengths['4hop'].append(len(question_words))\n",
|
|
|
|
|
" \n",
|
|
|
|
|
" if 'answer' in data:\n",
|
|
|
|
|
" # Handle unicode escape sequences\n",
|
|
|
|
|
" answer = data['answer'].encode().decode('unicode_escape')\n",
|
|
|
|
|
" answer_chars.update(answer.lower())\n",
|
|
|
|
|
" \n",
|
|
|
|
|
" # Count words and characters in answer\n",
|
|
|
|
|
" answer_words = answer.split()\n",
|
|
|
|
|
" answer_word_lengths.append(len(answer_words))\n",
|
|
|
|
|
" answer_char_lengths.append(len(answer))\n",
|
|
|
|
|
" \n",
|
|
|
|
|
" # Process paragraphs\n",
|
|
|
|
|
" for para in data.get('paragraphs', []):\n",
|
|
|
|
|
" if 'paragraph_text' in para:\n",
|
|
|
|
|
" # Handle unicode escape sequences\n",
|
|
|
|
|
" text = para['paragraph_text'].encode().decode('unicode_escape')\n",
|
|
|
|
|
" words = text.split()\n",
|
|
|
|
|
" para_word_lengths.append(len(words))\n",
|
|
|
|
|
" para_char_lengths.append(len(text))\n",
|
|
|
|
|
" \n",
|
|
|
|
|
" if 'title' in para:\n",
|
|
|
|
|
" # Handle unicode escape sequences\n",
|
|
|
|
|
" title = para['title'].encode().decode('unicode_escape')\n",
|
|
|
|
|
" title_words = title.split()\n",
|
|
|
|
|
" title_word_lengths.append(len(title_words))\n",
|
|
|
|
|
" title_char_lengths.append(len(title))\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"# Create a figure with subplots\n",
|
|
|
|
|
"fig, axs = plt.subplots(4, 2, figsize=(15, 20))\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"# Plot paragraph length distributions\n",
|
|
|
|
|
"axs[0, 0].hist(para_word_lengths, bins=50, alpha=0.7)\n",
|
|
|
|
|
"axs[0, 0].set_title('Paragraph Length (Words)')\n",
|
|
|
|
|
"axs[0, 0].set_xlabel('Number of Words')\n",
|
|
|
|
|
"axs[0, 0].set_ylabel('Frequency')\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"axs[0, 1].hist(para_char_lengths, bins=50, alpha=0.7)\n",
|
|
|
|
|
"axs[0, 1].set_title('Paragraph Length (Characters)')\n",
|
|
|
|
|
"axs[0, 1].set_xlabel('Number of Characters')\n",
|
|
|
|
|
"axs[0, 1].set_ylabel('Frequency')\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"# Plot title length distributions\n",
|
|
|
|
|
"axs[1, 0].hist(title_word_lengths, bins=30, alpha=0.7)\n",
|
|
|
|
|
"axs[1, 0].set_title('Title Length (Words)')\n",
|
|
|
|
|
"axs[1, 0].set_xlabel('Number of Words')\n",
|
|
|
|
|
"axs[1, 0].set_ylabel('Frequency')\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"axs[1, 1].hist(title_char_lengths, bins=30, alpha=0.7)\n",
|
|
|
|
|
"axs[1, 1].set_title('Title Length (Characters)')\n",
|
|
|
|
|
"axs[1, 1].set_xlabel('Number of Characters')\n",
|
|
|
|
|
"axs[1, 1].set_ylabel('Frequency')\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"# Plot answer length distributions\n",
|
|
|
|
|
"axs[2, 0].hist(answer_word_lengths, bins=30, alpha=0.7)\n",
|
|
|
|
|
"axs[2, 0].set_title('Answer Length (Words)')\n",
|
|
|
|
|
"axs[2, 0].set_xlabel('Number of Words')\n",
|
|
|
|
|
"axs[2, 0].set_ylabel('Frequency')\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"axs[2, 1].hist(answer_char_lengths, bins=30, alpha=0.7)\n",
|
|
|
|
|
"axs[2, 1].set_title('Answer Length (Characters)')\n",
|
|
|
|
|
"axs[2, 1].set_xlabel('Number of Characters')\n",
|
|
|
|
|
"axs[2, 1].set_ylabel('Frequency')\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"# Plot question length distributions\n",
|
|
|
|
|
"axs[3, 0].hist(question_word_lengths, bins=30, alpha=0.7)\n",
|
|
|
|
|
"axs[3, 0].set_title('Question Length (Words)')\n",
|
|
|
|
|
"axs[3, 0].set_xlabel('Number of Words')\n",
|
|
|
|
|
"axs[3, 0].set_ylabel('Frequency')\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"# Plot question length by hop count\n",
|
|
|
|
|
"hop_labels = ['2hop', '3hop', '4hop']\n",
|
|
|
|
|
"hop_means = [np.mean(hop_question_lengths[hop]) for hop in hop_labels]\n",
|
|
|
|
|
"hop_counts = [len(hop_question_lengths[hop]) for hop in hop_labels]\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"axs[3, 1].bar(hop_labels, hop_means, alpha=0.7)\n",
|
|
|
|
|
"axs[3, 1].set_title('Average Question Length by Hop Count')\n",
|
|
|
|
|
"axs[3, 1].set_xlabel('Hop Count')\n",
|
|
|
|
|
"axs[3, 1].set_ylabel('Average Number of Words')\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"# Add count labels on top of bars\n",
|
|
|
|
|
"for i, (count, mean) in enumerate(zip(hop_counts, hop_means)):\n",
|
|
|
|
|
" axs[3, 1].text(i, mean + 0.5, f'n={count}\\n{mean:.1f}', ha='center')\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"plt.tight_layout()\n",
|
|
|
|
|
"plt.show()\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"# Print statistics\n",
|
|
|
|
|
"print(f\"Total samples: {total_samples}\")\n",
|
|
|
|
|
"print(f\"Answerable samples: {answerable_count} ({answerable_count/total_samples*100:.2f}%)\")\n",
|
|
|
|
|
"print(f\"Samples with non-empty answer_aliases: {non_empty_aliases_count} ({non_empty_aliases_count/total_samples*100:.2f}%)\")\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"# Print paragraph length statistics\n",
|
|
|
|
|
"print(\"\\nParagraph length statistics (words):\")\n",
|
|
|
|
|
"print(f\" Min: {min(para_word_lengths)}\")\n",
|
|
|
|
|
"print(f\" Max: {max(para_word_lengths)}\")\n",
|
|
|
|
|
"print(f\" Mean: {np.mean(para_word_lengths):.2f}\")\n",
|
|
|
|
|
"print(f\" Median: {np.median(para_word_lengths):.2f}\")\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"print(\"\\nParagraph length statistics (characters):\")\n",
|
|
|
|
|
"print(f\" Min: {min(para_char_lengths)}\")\n",
|
|
|
|
|
"print(f\" Max: {max(para_char_lengths)}\")\n",
|
|
|
|
|
"print(f\" Mean: {np.mean(para_char_lengths):.2f}\")\n",
|
|
|
|
|
"print(f\" Median: {np.median(para_char_lengths):.2f}\")\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"# Print title length statistics\n",
|
|
|
|
|
"print(\"\\nTitle length statistics (words):\")\n",
|
|
|
|
|
"print(f\" Min: {min(title_word_lengths)}\")\n",
|
|
|
|
|
"print(f\" Max: {max(title_word_lengths)}\")\n",
|
|
|
|
|
"print(f\" Mean: {np.mean(title_word_lengths):.2f}\")\n",
|
|
|
|
|
"print(f\" Median: {np.median(title_word_lengths):.2f}\")\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"# Print question length statistics\n",
|
|
|
|
|
"print(\"\\nQuestion length statistics (words):\")\n",
|
|
|
|
|
"print(f\" Min: {min(question_word_lengths)}\")\n",
|
|
|
|
|
"print(f\" Max: {max(question_word_lengths)}\")\n",
|
|
|
|
|
"print(f\" Mean: {np.mean(question_word_lengths):.2f}\")\n",
|
|
|
|
|
"print(f\" Median: {np.median(question_word_lengths):.2f}\")\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"print(\"\\nQuestion length statistics (characters):\")\n",
|
|
|
|
|
"print(f\" Min: {min(question_char_lengths)}\")\n",
|
|
|
|
|
"print(f\" Max: {max(question_char_lengths)}\")\n",
|
|
|
|
|
"print(f\" Mean: {np.mean(question_char_lengths):.2f}\")\n",
|
|
|
|
|
"print(f\" Median: {np.median(question_char_lengths):.2f}\")\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"# Print question length by hop count\n",
|
|
|
|
|
"print(\"\\nQuestion length by hop count (words):\")\n",
|
|
|
|
|
"for hop in hop_labels:\n",
|
|
|
|
|
" print(f\" {hop}: {np.mean(hop_question_lengths[hop]):.2f} words (n={len(hop_question_lengths[hop])})\")\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"# Print answer length statistics\n",
|
|
|
|
|
"print(\"\\nAnswer length statistics (words):\")\n",
|
|
|
|
|
"print(f\" Min: {min(answer_word_lengths)}\")\n",
|
|
|
|
|
"print(f\" Max: {max(answer_word_lengths)}\")\n",
|
|
|
|
|
"print(f\" Mean: {np.mean(answer_word_lengths):.2f}\")\n",
|
|
|
|
|
"print(f\" Median: {np.median(answer_word_lengths):.2f}\")\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"# Print unique characters\n",
|
|
|
|
|
"print(\"\\nUnique characters in questions (lowercased):\")\n",
|
|
|
|
|
"print(''.join(sorted(question_chars)))\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"print(\"\\nUnique characters in answers (lowercased):\")\n",
|
|
|
|
|
"print(''.join(sorted(answer_chars)))\n"
|
|
|
|
|
]
|
|
|
|
|
},
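The paragraph-length statistics feed the back-of-envelope context budget in the notes (mean words per paragraph -> tokens -> top-k -> hops). A small sketch of that arithmetic, assuming the rule of thumb of roughly 0.75 words per token quoted earlier. The defaults (mean of 79.02 words, k=5, 4 hops) come from the notes, and the function name is made up.

```python
def estimate_context_tokens(mean_words_per_paragraph=79.02, top_k=5, hops=4,
                            words_per_token=0.75):
    """Rough token budget for retrieved passages in a multi-hop rollout.

    Ignores titles, the prompt template, and the model's own thinking and
    query tokens, so treat the result as a lower bound.
    """
    tokens_per_paragraph = mean_words_per_paragraph / words_per_token
    tokens_per_query = tokens_per_paragraph * top_k
    tokens_total = tokens_per_query * hops
    return {
        "tokens_per_paragraph": tokens_per_paragraph,
        "tokens_per_query": tokens_per_query,
        "tokens_total": tokens_total,
    }


budget = estimate_context_tokens()
for name, value in budget.items():
    print(f"{name}: {value:.0f}")
```

With the defaults this lands in the same range as the notes (about 100-110 tokens per paragraph and 2000-2500 tokens for a clean 4-hop run). Swapping in the max paragraph length instead of the mean gives a worst-case figure.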
|
|
|
|
|
{
|
|
|
|
|
"cell_type": "code",
|
|
|
|
|
"execution_count": null,
|
|
|
|
|
"metadata": {},
|
|
|
|
|
"outputs": [],
|
|
|
|
|
"source": [
|
|
|
|
|
"# Find samples with suspicious characters in questions and answers\n",
|
|
|
|
|
"import json\n",
|
|
|
|
|
"import random\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"# Define suspicious characters (non-ASCII and special characters)\n",
|
|
|
|
|
"suspicious_chars = \"\"\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"# Function to check if text contains any suspicious characters\n",
|
|
|
|
|
"def contains_suspicious_chars(text, chars_to_check):\n",
|
|
|
|
|
" return any(char in text for char in chars_to_check)\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"# Lists to store samples with suspicious characters\n",
|
|
|
|
|
"question_samples = []\n",
|
|
|
|
|
"answer_samples = []\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"# Read the jsonl file again to find examples\n",
|
|
|
|
|
"with open(file_path, 'r', encoding='utf-8') as f:\n",
|
|
|
|
|
" for line in f:\n",
|
|
|
|
|
" data = json.loads(line)\n",
|
|
|
|
|
" \n",
|
|
|
|
|
" # Check question\n",
|
|
|
|
|
" if 'question' in data and contains_suspicious_chars(data['question'].lower(), suspicious_chars):\n",
|
|
|
|
|
" question_samples.append({\n",
|
|
|
|
|
" 'id': data.get('id', 'unknown'),\n",
|
|
|
|
|
" 'question': data['question'],\n",
|
|
|
|
|
" 'suspicious_chars': [char for char in data['question'] if char.lower() in suspicious_chars]\n",
|
|
|
|
|
" })\n",
|
|
|
|
|
" \n",
|
|
|
|
|
" # Check answer\n",
|
|
|
|
|
" if 'answer' in data and contains_suspicious_chars(data['answer'].lower(), suspicious_chars):\n",
|
|
|
|
|
" answer_samples.append({\n",
|
|
|
|
|
" 'id': data.get('id', 'unknown'),\n",
|
|
|
|
|
" 'answer': data['answer'],\n",
|
|
|
|
|
" 'suspicious_chars': [char for char in data['answer'] if char.lower() in suspicious_chars]\n",
|
|
|
|
|
" })\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"# Print some samples with suspicious characters in questions\n",
|
|
|
|
|
"print(f\"Found {len(question_samples)} samples with suspicious characters in questions\")\n",
|
|
|
|
|
"if question_samples:\n",
|
|
|
|
|
" samples_to_show = min(5, len(question_samples))\n",
|
|
|
|
|
" print(f\"\\nShowing {samples_to_show} random samples with suspicious characters in questions:\")\n",
|
|
|
|
|
" for sample in random.sample(question_samples, samples_to_show):\n",
|
|
|
|
|
" print(f\"ID: {sample['id']}\")\n",
|
|
|
|
|
" print(f\"Question: {sample['question']}\")\n",
|
|
|
|
|
" print(f\"Suspicious characters: {', '.join(set(sample['suspicious_chars']))}\")\n",
|
|
|
|
|
" print()\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"# Print some samples with suspicious characters in answers\n",
|
|
|
|
|
"print(f\"Found {len(answer_samples)} samples with suspicious characters in answers\")\n",
|
|
|
|
|
"if answer_samples:\n",
|
|
|
|
|
" samples_to_show = min(5, len(answer_samples))\n",
|
|
|
|
|
" print(f\"\\nShowing {samples_to_show} random samples with suspicious characters in answers:\")\n",
|
|
|
|
|
" for sample in random.sample(answer_samples, samples_to_show):\n",
|
|
|
|
|
" print(f\"ID: {sample['id']}\")\n",
|
|
|
|
|
" print(f\"Answer: {sample['answer']}\")\n",
|
|
|
|
|
" print(f\"Suspicious characters: {', '.join(set(sample['suspicious_chars']))}\")\n",
|
|
|
|
|
" print()\n"
|
|
|
|
|
]
|
|
|
|
|
},
|
|
|
|
|
{
|
|
|
|
|
"cell_type": "markdown",
|
|
|
|
|
"metadata": {},
|
|
|
|
|
"source": [
|
|
|
|
|
"### Actually touch some samples"
|
|
|
|
|
]
|
|
|
|
|
},
|
|
|
|
|
{
|
|
|
|
|
"cell_type": "code",
|
|
|
|
|
"execution_count": null,
|
|
|
|
|
"metadata": {},
|
|
|
|
|
"outputs": [],
|
|
|
|
|
"source": [
|
|
|
|
|
"{\n",
|
|
|
|
|
" \"id\": \"2hop__460946_294723\",\n",
|
|
|
|
|
" \"paragraphs\": [\n",
|
|
|
|
|
" {\n",
|
|
|
|
|
" \"idx\": 0,\n",
|
|
|
|
|
" \"title\": \"Grant's First Stand\",\n",
|
|
|
|
|
" \"paragraph_text\": 'Grant\\'s First Stand is the debut album by American jazz guitarist Grant Green featuring performances by Green recorded and released on the Blue Note label in 1961. Earlier recordings made by Green for Blue Note were released as \"First Session\" in 2001.',\n",
|
|
|
|
|
" \"is_supporting\": false,\n",
|
|
|
|
|
" },\n",
|
|
|
|
|
" {\n",
|
|
|
|
|
" \"idx\": 1,\n",
|
|
|
|
|
" \"title\": \"List of show business families\",\n",
|
|
|
|
|
" \"paragraph_text\": \"Actress / director / singer Phylicia Rash\\u0101d is the older sister of performer Debbie Allen, who is married to former NBA basketball player, Norm Nixon. Phylicia Rash\\u0101d is the former spouse of both Victor Willis, former lead singer of the group Village People, and former NFL football player turned sportscaster, Ahmad Rash\\u0101d. Phylicia and Ahmad Rash\\u0101d are the parents of actress Condola Rash\\u0101d.\",\n",
|
|
|
|
|
" \"is_supporting\": false,\n",
|
|
|
|
|
" },\n",
|
|
|
|
|
" {\n",
|
|
|
|
|
" \"idx\": 5,\n",
|
|
|
|
|
" \"title\": \"Miquette Giraudy\",\n",
|
|
|
|
|
" \"paragraph_text\": \"Miquette Giraudy (born 9 February 1953, Nice, France) is a keyboard player and vocalist, best known for her work in Gong and with her partner Steve Hillage. She and Hillage currently form the core of the ambient band System 7. In addition to her performances in music, she has also worked as an actress, film editor and writer. In each role, she has used different stage names.\",\n",
|
|
|
|
|
" \"is_supporting\": true,\n",
|
|
|
|
|
" },\n",
|
|
|
|
|
" ...\n",
|
|
|
|
|
" {\n",
|
|
|
|
|
" \"idx\": 19,\n",
|
|
|
|
|
" \"title\": \"Mok Kwai-lan\",\n",
|
|
|
|
|
" \"paragraph_text\": \"Mok Kwai-lan (; October 15, 1892 \\u2013 November 3, 1982) was the fourth spouse of Lingnan martial arts grandmaster Wong Fei-hung.\",\n",
|
|
|
|
|
" \"is_supporting\": false,\n",
|
|
|
|
|
" },\n",
|
|
|
|
|
" ],\n",
|
|
|
|
|
" \"question\": \"Who is the spouse of the Green performer?\",\n",
|
|
|
|
|
" \"question_decomposition\": [\n",
|
|
|
|
|
" {\"id\": 460946, \"question\": \"Green >> performer\", \"answer\": \"Steve Hillage\", \"paragraph_support_idx\": 10},\n",
|
|
|
|
|
" {\"id\": 294723, \"question\": \"#1 >> spouse\", \"answer\": \"Miquette Giraudy\", \"paragraph_support_idx\": 5},\n",
|
|
|
|
|
" ],\n",
|
|
|
|
|
" \"answer\": \"Miquette Giraudy\",\n",
|
|
|
|
|
" \"answer_aliases\": [],\n",
|
|
|
|
|
" \"answerable\": true,\n",
|
|
|
|
|
"}"
|
|
|
|
|
]
|
|
|
|
|
},
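Given samples shaped like the one above, the passage database can be built by prepending each title to its paragraph text, as planned in the notes. A minimal sketch, assuming a newline separator between title and text. The separator and both function names are assumptions and should be checked against what FlashRAG actually returns.

```python
def format_passage(paragraph, sep="\n"):
    """Prepend the title to the paragraph text (the separator is an assumption)."""
    return f"{paragraph['title']}{sep}{paragraph['paragraph_text']}"


def build_corpus(samples):
    """Collect unique formatted passages across samples, keeping first-seen order."""
    seen = set()
    corpus = []
    for sample in samples:
        for paragraph in sample.get("paragraphs", []):
            passage = format_passage(paragraph)
            # The same paragraph can appear in several questions.
            if passage not in seen:
                seen.add(passage)
                corpus.append(passage)
    return corpus


# Tiny made-up samples reusing titles from the example above.
samples = [
    {"paragraphs": [
        {"title": "Miquette Giraudy", "paragraph_text": "Keyboard player and vocalist."},
        {"title": "Mok Kwai-lan", "paragraph_text": "Spouse of Wong Fei-hung."},
    ]},
    {"paragraphs": [
        {"title": "Miquette Giraudy", "paragraph_text": "Keyboard player and vocalist."},
    ]},
]
print(len(build_corpus(samples)))
```

De-duplicating before embedding keeps the FAISS index from holding the same passage several times.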
|
|
|
|
|
{
|
|
|
|
|
"cell_type": "code",
|
|
|
|
|
"execution_count": null,
|
|
|
|
|
"metadata": {},
|
|
|
|
|
"outputs": [],
|
|
|
|
|
"source": [
|
|
|
|
|
"{\n",
|
|
|
|
|
" \"id\": \"3hop1__145427_106426_77199\",\n",
|
|
|
|
|
" \"paragraphs\": [\n",
|
|
|
|
|
" {\n",
|
|
|
|
|
" \"idx\": 0,\n",
|
|
|
|
|
" \"title\": \"USA Up All Night\",\n",
|
|
|
|
|
" \"paragraph_text\": \"USA Up All Night (also known as Up All Night and Up All Night with Rhonda Shear) is an American cable television series that aired weekly on Friday and Saturday nights on the USA Network. The show aired from 1989 to 1998.\",\n",
|
|
|
|
|
" \"is_supporting\": false,\n",
|
|
|
|
|
" },\n",
|
|
|
|
|
" {\n",
|
|
|
|
|
" \"idx\": 1,\n",
|
|
|
|
|
" \"title\": \"Pasadena Society of Artists\",\n",
|
|
|
|
|
" \"paragraph_text\": \"The Pasadena Society of Artists, founded in 1925, is one of the longest-running, nonprofit arts organizations in the state of California, USA.\",\n",
|
|
|
|
|
" \"is_supporting\": false,\n",
|
|
|
|
|
" },\n",
|
|
|
|
|
" {\n",
|
|
|
|
|
" \"idx\": 2,\n",
|
|
|
|
|
" \"title\": \"Franco-Prussian War\",\n",
|
|
|
|
|
" \"paragraph_text\": 'While the republican government was amenable to war reparations or ceding colonial territories in Africa or in South East Asia to Prussia, Favre on behalf of the Government of National Defense, declared on 6 September that France would not \"yield an inch of its territory nor a stone of its fortresses.\" The republic then renewed the declaration of war, called for recruits in all parts of the country and pledged to drive the German troops out of France by a guerre \\u00e0 outrance. Under these circumstances, the Germans had to continue the war, yet could not pin down any proper military opposition in their vicinity. As the bulk of the remaining French armies were digging-in near Paris, the German leaders decided to put pressure upon the enemy by attacking Paris. By September 15, German troops reached the outskirts of the fortified city. On September 19, the Germans surrounded it and erected a blockade, as already established at Metz.',\n",
|
|
|
|
|
" \"is_supporting\": false,\n",
|
|
|
|
|
" },\n",
|
|
|
|
|
" {\n",
|
|
|
|
|
" \"idx\": 3,\n",
|
|
|
|
|
" \"title\": \"The Longest Night in Shanghai\",\n",
|
|
|
|
|
" \"paragraph_text\": \"The Longest Night in Shanghai () is a 2007 film produced by Japan's Movie Eye Entertainment and directed by Chinese director Zhang Yibai. It is a rare collaboration between China and Japan.\",\n",
|
|
|
|
|
" \"is_supporting\": true,\n",
|
|
|
|
|
" },\n",
|
|
|
|
|
" {\n",
|
|
|
|
|
" \"idx\": 4,\n",
|
|
|
|
|
" \"title\": \"Declaration of war by the United States\",\n",
|
|
|
|
|
" \"paragraph_text\": \"The last time the United States declared war on any nation was in 1942, when war was declared against Axis - allied Hungary, Bulgaria, and Romania, because President Franklin Roosevelt thought it was improper to engage in hostilities against a country without a declaration of war. Since then, every American president has used military force without a declaration of war.\",\n",
|
|
|
|
|
" \"is_supporting\": false,\n",
|
|
|
|
|
" },\n",
|
|
|
|
|
" {\n",
|
|
|
|
|
" \"idx\": 5,\n",
|
|
|
|
|
" \"title\": \"John Trumbull Birthplace\",\n",
|
|
|
|
|
" \"paragraph_text\": \"The John Trumbull Birthplace, also known as the Governor Jonathan Trumbull House, is a historic house museum on the Lebanon Green in Lebanon, Connecticut. Built in 1735 by Joseph Trumbull as a wedding present for his son Jonathan (1710-1785), the house was a center of political and military strategy during the American Revolutionary War, when Jonathan Trumbull was Governor of Connecticut. It was also the birthplace of John Trumbull (1756-1843), an artist known for his depictions of the war and its people. The house was designated a National Historic Landmark in 1965.\",\n",
|
|
|
|
|
" \"is_supporting\": false,\n",
|
|
|
|
|
" },\n",
|
|
|
|
|
" {\n",
|
|
|
|
|
" \"idx\": 6,\n",
|
|
|
|
|
" \"title\": \"Annibale Bergonzoli\",\n",
|
|
|
|
|
" \"paragraph_text\": 'Annibale Bergonzoli (1 November 1884 \\u2013 31 July 1973), nicknamed \"\"barba elettrica\"\", \"Electric Whiskers\", was an Italian Lieutenant General who served during World War I, the Spanish Civil War and World War II. In 1940 he commanded the defences of Bardia, Libya. In February 1941, after the disastrous Battle of Beda Fomm, Bergonzoli surrendered to Australian forces. He was held as a prisoner in India and the USA before being repatriated to Italy. Bergonzoli settled in his birthplace, Cannobio, and died there in 1973.',\n",
|
|
|
|
|
" \"is_supporting\": false,\n",
|
|
|
|
|
" },\n",
|
|
|
|
|
" {\n",
|
|
|
|
|
" \"idx\": 7,\n",
|
|
|
|
|
" \"title\": \"Allies of World War II\",\n",
|
|
|
|
|
" \"paragraph_text\": \"The Allies of World War II, called the United Nations from the 1 January 1942 declaration, were the countries that together opposed the Axis powers during the Second World War (1939 -- 1945). The Allies promoted the alliance as seeking to stop German, Japanese and Italian aggression.\",\n",
|
|
|
|
|
" \"is_supporting\": false,\n",
|
|
|
|
|
" },\n",
|
|
|
|
|
" {\n",
|
|
|
|
|
" \"idx\": 8,\n",
|
|
|
|
|
" \"title\": \"Mardi Gras in the United States\",\n",
|
|
|
|
|
" \"paragraph_text\": \"In 1875, the state of Louisiana declared Mardi Gras a legal holiday. Economic, political, and weather conditions sometimes led to the cancellation of some or all of the major parades, especially during the American Civil War, World War I and World War II, but Carnival has always been observed in the city in some way.\",\n",
|
|
|
|
|
" \"is_supporting\": false,\n",
|
|
|
|
|
" },\n",
|
|
|
|
|
" {\n",
|
|
|
|
|
" \"idx\": 9,\n",
|
|
|
|
|
" \"title\": \"Shirley Plantation\",\n",
|
|
|
|
|
" \"paragraph_text\": \"Shirley Plantation is an estate located on the north bank of the James River in Charles City County, Virginia, USA. It is located on State Route 5, a scenic byway which runs between the independent cities of Richmond and Williamsburg. Shirley Plantation is the oldest active plantation in Virginia and is the oldest family-owned business in North America, dating back to 1614 with operations starting in 1638. The plantation was added to the National Register in 1969 and declared a National Historic Landmark in 1970.\",\n",
|
|
|
|
|
" \"is_supporting\": false,\n",
|
|
|
|
|
" },\n",
|
|
|
|
|
" {\n",
|
|
|
|
|
" \"idx\": 10,\n",
|
|
|
|
|
" \"title\": \"Military history of the United States during World War II\",\n",
|
|
|
|
|
" \"paragraph_text\": \"On 11 December 1941, Adolf Hitler and Nazi Germany declared war against the United States, the same day that the United States declared war on Germany and Italy.\",\n",
|
|
|
|
|
" \"is_supporting\": false,\n",
|
|
|
|
|
" },\n",
|
|
|
|
|
" {\n",
|
|
|
|
|
" \"idx\": 11,\n",
|
|
|
|
|
" \"title\": \"San Miguel de Allende\",\n",
|
|
|
|
|
" \"paragraph_text\": \"San Miguel de Allende (Spanish pronunciation: (san mi'\\u0263el de a'\\u028eende)) is a city and municipality located in the far eastern part of the state of Guanajuato in central Mexico. It is part of the macroregion of Baj\\u00edo. It is 274 km (170 mi) from Mexico City, 86 km (53 mi) from Queretaro, and 97 km (60 mi) from the state capital of Guanajuato. Historically, the town is important as being the birthplace of Mexican General Ignacio Allende, whose surname was added to the town's name in 1826, as well as the first municipality declared independent of Spanish rule by the nascent insurgent army during the Mexican War of Independence. San Miguel de Allende was also a critical epicenter during the historic Chichimeca War (1540 - 1590) where the Chichimeca Confederation defeated the Spanish Empire in the initial colonization war. Today, the town is a proclaimed World Heritage Site, attracting thousands of tourists and new residents from abroad every year.\",\n",
|
|
|
|
|
" \"is_supporting\": false,\n",
|
|
|
|
|
" },\n",
|
|
|
|
|
" {\n",
|
|
|
|
|
" \"idx\": 12,\n",
|
|
|
|
|
" \"title\": \"One Thrilling Night\",\n",
|
|
|
|
|
" \"paragraph_text\": \"The newlywed country bumpkins from Connecticut, Mr. and Mrs. Horace Jason (John Beal and Wanda McKay), check into the Hotel Clarke in New York City, prepared to spend their first night together as a married couple. It is also their first and last night before Horace joins the Army.\",\n",
|
|
|
|
|
" \"is_supporting\": false,\n",
|
|
|
|
|
" },\n",
|
|
|
|
|
" {\n",
|
|
|
|
|
" \"idx\": 13,\n",
|
|
|
|
|
" \"title\": \"Battle of the Atlantic\",\n",
|
|
|
|
|
" \"paragraph_text\": \"The Battle of the Atlantic was the longest continuous military campaign in World War II, running from 1939 to the defeat of Nazi Germany in 1945, and was a major part of the Naval history of World War II. At its core was the Allied naval blockade of Germany, announced the day after the declaration of war, and Germany's subsequent counter-blockade. It was at its height from mid-1940 through to the end of 1943.\",\n",
|
|
|
|
|
" \"is_supporting\": false,\n",
|
|
|
|
|
" },\n",
|
|
|
|
|
" {\n",
|
|
|
|
|
" \"idx\": 14,\n",
|
|
|
|
|
" \"title\": \"Francis Scott Key Bridge (Baltimore)\",\n",
|
|
|
|
|
" \"paragraph_text\": \"The Francis Scott Key Bridge, also known as the Outer Harbor Bridge or simply the Key Bridge, is a steel arch-shaped continuous through truss bridge spanning the Patapsco River in Baltimore, Maryland, USA. The main span of is the third longest span of any continuous truss in the world. It is also the longest bridge in the Baltimore area.\",\n",
|
|
|
|
|
" \"is_supporting\": false,\n",
|
|
|
|
|
" },\n",
|
|
|
|
|
" {\n",
|
|
|
|
|
" \"idx\": 15,\n",
|
|
|
|
|
" \"title\": \"Military history of Italy during World War II\",\n",
|
|
|
|
|
" \"paragraph_text\": \"On 10 June 1940, as the French government fled to Bordeaux during the German invasion, declaring Paris an open city, Mussolini felt the conflict would soon end and declared war on Britain and France. As he said to the Army's Chief - of - Staff, Marshal Badoglio:\",\n",
|
|
|
|
|
" \"is_supporting\": false,\n",
|
|
|
|
|
" },\n",
|
|
|
|
|
" {\n",
|
|
|
|
|
" \"idx\": 16,\n",
|
|
|
|
|
" \"title\": \"Sandia Peak Tramway\",\n",
|
|
|
|
|
" \"paragraph_text\": \"The Sandia Peak Tramway is an aerial tramway located adjacent to Albuquerque, New Mexico, USA. It stretches from the northeast edge of the city to the crestline of the Sandia Mountains and has the world's third longest single span. It is the longest aerial tram in the United States.\",\n",
|
|
|
|
|
" \"is_supporting\": false,\n",
|
|
|
|
|
" },\n",
|
|
|
|
|
" {\n",
|
|
|
|
|
" \"idx\": 17,\n",
|
|
|
|
|
" \"title\": \"United States declaration of war on Japan\",\n",
|
|
|
|
|
" \"paragraph_text\": \"On December 8, 1941, the United States Congress declared war (Public Law 77 - 328, 55 STAT 795) on the Empire of Japan in response to that country's surprise attack on Pearl Harbor the prior day. It was formulated an hour after the Infamy Speech of US President Franklin D. Roosevelt. Japan had sent a message for the United States to its embassy in Washington earlier, but because of problems at the embassy in decoding the very long message -- the high security level assigned to the declaration meant that only personnel with very high clearances could decode it, which slowed down the process -- it was not delivered to the U.S. Secretary of State until after the Pearl Harbor attack. Following the U.S. declaration, Japan's allies, Germany and Italy, declared war on the United States, bringing the United States fully into World War II.\",\n",
|
|
|
|
|
" \"is_supporting\": true,\n",
|
|
|
|
|
" },\n",
|
|
|
|
|
" {\n",
|
|
|
|
|
" \"idx\": 18,\n",
|
|
|
|
|
" \"title\": \"Faces of War Memorial\",\n",
|
|
|
|
|
" \"paragraph_text\": \"Faces Of War Memorial is a Vietnam War memorial located in Roswell, Georgia, USA. It is located on the grounds of Roswell City Hall and was dedicated on January 1, 1998.\",\n",
|
|
|
|
|
" \"is_supporting\": false,\n",
|
|
|
|
|
" },\n",
|
|
|
|
|
" {\n",
|
|
|
|
|
" \"idx\": 19,\n",
|
|
|
|
|
" \"title\": \"You Wenhui\",\n",
|
|
|
|
|
" \"paragraph_text\": \"You Wenhui (; born October 20, 1979 in Shanghai) is a female Chinese beach volleyball player who competed in the 2004 Summer Olympics.\",\n",
|
|
|
|
|
" \"is_supporting\": true,\n",
|
|
|
|
|
" },\n",
|
|
|
|
|
" ],\n",
|
|
|
|
|
" \"question\": \"When did the USA declare war on the country that produced The Longest Night in the city where You Wenhui was born?\",\n",
|
|
|
|
|
" \"question_decomposition\": [\n",
|
|
|
|
|
" {\n",
|
|
|
|
|
" \"id\": 145427,\n",
|
|
|
|
|
" \"question\": \"Which city was the birthplace of You Wenhui?\",\n",
|
|
|
|
|
" \"answer\": \"Shanghai\",\n",
|
|
|
|
|
" \"paragraph_support_idx\": 19,\n",
|
|
|
|
|
" },\n",
|
|
|
|
|
" {\n",
|
|
|
|
|
" \"id\": 106426,\n",
|
|
|
|
|
" \"question\": \"Which was the country for The Longest Night in #1 ?\",\n",
|
|
|
|
|
" \"answer\": \"Japan\",\n",
|
|
|
|
|
" \"paragraph_support_idx\": 3,\n",
|
|
|
|
|
" },\n",
|
|
|
|
|
" {\n",
|
|
|
|
|
" \"id\": 77199,\n",
|
|
|
|
|
" \"question\": \"when did the usa declare war on #2\",\n",
|
|
|
|
|
" \"answer\": \"December 8, 1941\",\n",
|
|
|
|
|
" \"paragraph_support_idx\": 17,\n",
|
|
|
|
|
" },\n",
|
|
|
|
|
" ],\n",
|
|
|
|
|
" \"answer\": \"December 8, 1941\",\n",
|
|
|
|
|
" \"answer_aliases\": [],\n",
|
|
|
|
|
" \"answerable\": true,\n",
|
|
|
|
|
"}\n"
|
|
|
|
|
]
|
|
|
|
|
},
|
|
|
|
|
{
|
|
|
|
|
"cell_type": "code",
|
|
|
|
|
"execution_count": null,
|
|
|
|
|
"metadata": {},
|
|
|
|
|
"outputs": [],
|
|
|
|
|
"source": [
|
|
|
|
|
"{\n",
|
|
|
|
|
" \"id\": \"3hop2__101905_30152_20999\",\n",
|
|
|
|
|
" \"paragraphs\": [\n",
|
|
|
|
|
" {\n",
|
|
|
|
|
" \"idx\": 0,\n",
|
|
|
|
|
" \"title\": \"Somalis\",\n",
|
|
|
|
|
" \"paragraph_text\": \"In addition, the Somali community has produced numerous important Muslim figures over the centuries, many of whom have significantly shaped the course of Islamic learning and practice in the Horn of Africa, the Arabian Peninsula and well beyond.\",\n",
|
|
|
|
|
" \"is_supporting\": false,\n",
|
|
|
|
|
" },\n",
|
|
|
|
|
" {\n",
|
|
|
|
|
" \"idx\": 1,\n",
|
|
|
|
|
" \"title\": \"Umayyad Caliphate\",\n",
|
|
|
|
|
" \"paragraph_text\": \"Non-Muslim groups in the Umayyad Caliphate, which included Christians, Jews, Zoroastrians, and pagan Berbers, were called dhimmis. They were given a legally protected status as second-class citizens as long as they accepted and acknowledged the political supremacy of the ruling Muslims. They were allowed to have their own courts, and were given freedom of their religion within the empire.[citation needed] Although they could not hold the highest public offices in the empire, they had many bureaucratic positions within the government. Christians and Jews still continued to produce great theological thinkers within their communities, but as time wore on, many of the intellectuals converted to Islam, leading to a lack of great thinkers in the non-Muslim communities.\",\n",
|
|
|
|
|
" \"is_supporting\": false,\n",
|
|
|
|
|
" },\n",
|
|
|
|
|
" {\n",
|
|
|
|
|
" \"idx\": 2,\n",
|
|
|
|
|
" \"title\": \"Somalis\",\n",
|
|
|
|
|
" \"paragraph_text\": \"Somalis (Somali: Soomaali, Arabic: \\u0635\\u0648\\u0645\\u0627\\u0644\\u200e) are an ethnic group inhabiting the Horn of Africa (Somali Peninsula). The overwhelming majority of Somalis speak the Somali language, which is part of the Cushitic branch of the Afro-Asiatic family. They are predominantly Sunni Muslim. Ethnic Somalis number around 16-20 million and are principally concentrated in Somalia (around 12.3 million), Ethiopia (4.6 million), Kenya (2.4 million), and Djibouti (464,600), with many also residing in parts of the Middle East, North America and Europe.\",\n",
|
|
|
|
|
" \"is_supporting\": false,\n",
|
|
|
|
|
" },\n",
|
|
|
|
|
" {\n",
|
|
|
|
|
" \"idx\": 3,\n",
|
|
|
|
|
" \"title\": \"Visa requirements for New Zealand citizens\",\n",
|
|
|
|
|
" \"paragraph_text\": \"Visa requirements for New Zealand citizens are administrative entry restrictions by the authorities of other states placed on citizens of New Zealand. As of 1 January 2017, New Zealand citizens had visa - free or visa on arrival access to 172 countries and territories, ranking the New Zealand passport 5th in terms of travel freedom (tied with Irish and Japanese passports) according to the Henley visa restrictions index.\",\n",
|
|
|
|
|
" \"is_supporting\": false,\n",
|
|
|
|
|
" },\n",
|
|
|
|
|
" {\n",
|
|
|
|
|
" \"idx\": 4,\n",
|
|
|
|
|
" \"title\": \"Ottoman Empire\",\n",
|
|
|
|
|
" \"paragraph_text\": \"The discovery of new maritime trade routes by Western European states allowed them to avoid the Ottoman trade monopoly. The Portuguese discovery of the Cape of Good Hope in 1488 initiated a series of Ottoman-Portuguese naval wars in the Indian Ocean throughout the 16th century. The Somali Muslim Ajuran Empire, allied with the Ottomans, defied the Portuguese economic monopoly in the Indian Ocean by employing a new coinage which followed the Ottoman pattern, thus proclaiming an attitude of economic independence in regard to the Portuguese.\",\n",
|
|
|
|
|
" \"is_supporting\": true,\n",
|
|
|
|
|
" },\n",
|
|
|
|
|
" {\n",
|
|
|
|
|
" \"idx\": 5,\n",
|
|
|
|
|
" \"title\": \"Somalis\",\n",
|
|
|
|
|
" \"paragraph_text\": \"Somali people in the Horn of Africa are divided among different countries (Somalia, Djibouti, Ethiopia, and northeastern Kenya) that were artificially and some might say arbitrarily partitioned by the former imperial powers. Pan-Somalism is an ideology that advocates the unification of all ethnic Somalis once part of Somali empires such as the Ajuran Empire, the Adal Sultanate, the Gobroon Dynasty and the Dervish State under one flag and one nation. The Siad Barre regime actively promoted Pan-Somalism, which eventually led to the Ogaden War between Somalia on one side, and Ethiopia, Cuba and the Soviet Union on the other.\",\n",
|
|
|
|
|
" \"is_supporting\": false,\n",
|
|
|
|
|
" },\n",
|
|
|
|
|
" {\n",
|
|
|
|
|
" \"idx\": 6,\n",
|
|
|
|
|
" \"title\": \"Communications in Somalia\",\n",
|
|
|
|
|
" \"paragraph_text\": \"There are a number of radio news agencies based in Somalia. Established during the colonial period, Radio Mogadishu initially broadcast news items in both Somali and Italian. The station was modernized with Russian assistance following independence in 1960, and began offering home service in Somali, Amharic and Oromo. After closing down operations in the early 1990s due to the civil war, the station was officially re-opened in the early 2000s by the Transitional National Government. In the late 2000s, Radio Mogadishu also launched a complementary website of the same name, with news items in Somali, Arabic and English.\",\n",
|
|
|
|
|
" \"is_supporting\": false,\n",
|
|
|
|
|
" },\n",
|
|
|
|
|
" {\n",
|
|
|
|
|
" \"idx\": 7,\n",
|
|
|
|
|
" \"title\": \"Somalis\",\n",
|
|
|
|
|
" \"paragraph_text\": \"Following World War II, Britain retained control of both British Somaliland and Italian Somaliland as protectorates. In 1945, during the Potsdam Conference, the United Nations granted Italy trusteeship of Italian Somaliland, but only under close supervision and on the condition \\u2014 first proposed by the Somali Youth League (SYL) and other nascent Somali political organizations, such as Hizbia Digil Mirifle Somali (HDMS) and the Somali National League (SNL) \\u2014 that Somalia achieve independence within ten years. British Somaliland remained a protectorate of Britain until 1960.\",\n",
|
|
|
|
|
" \"is_supporting\": false,\n",
|
|
|
|
|
" },\n",
|
|
|
|
|
" {\n",
|
|
|
|
|
" \"idx\": 8,\n",
|
|
|
|
|
" \"title\": \"Somalis in the United Kingdom\",\n",
|
|
|
|
|
" \"paragraph_text\": \"Somalis in the United Kingdom include British citizens and residents born in, or with ancestors from, Somalia. It is thought that the United Kingdom (UK) is home to the largest Somali community in Europe, with an estimated 98,000 Somali - born immigrants residing in the UK in 2016 according to the Office for National Statistics. The majority of these live in England, with the largest number found in London. Smaller Somali communities exist in Birmingham, Bristol, Manchester, Liverpool, Leicester, Milton Keynes, Sheffield and Cardiff.\",\n",
|
|
|
|
|
" \"is_supporting\": false,\n",
|
|
|
|
|
" },\n",
|
|
|
|
|
" {\n",
|
|
|
|
|
" \"idx\": 9,\n",
|
|
|
|
|
" \"title\": \"Somalis\",\n",
|
|
|
|
|
" \"paragraph_text\": \"The history of Islam in Somalia is as old as the religion itself. The early persecuted Muslims fled to various places in the region, including the city of Zeila in modern-day northern Somalia, so as to seek protection from the Quraysh. Somalis were among the first populations on the continent to embrace Islam. With very few exceptions, Somalis are entirely Muslims, the majority belonging to the Sunni branch of Islam and the Shafi`i school of Islamic jurisprudence, although a few are also adherents of the Shia Muslim denomination.\",\n",
|
|
|
|
|
" \"is_supporting\": false,\n",
|
|
|
|
|
" },\n",
|
|
|
|
|
" {\n",
|
|
|
|
|
" \"idx\": 10,\n",
|
|
|
|
|
" \"title\": \"Portugal\",\n",
|
|
|
|
|
" \"paragraph_text\": \"The land within the borders of current Portugal has been continuously settled and fought over since prehistoric times. The Celts and the Romans were followed by the Visigothic and the Suebi Germanic peoples, who were themselves later invaded by the Moors. These Muslim peoples were eventually expelled during the Christian Reconquista of the peninsula. By 1139, Portugal had established itself as a kingdom independent from Le\\u00f3n. In the 15th and 16th centuries, as the result of pioneering the Age of Discovery, Portugal expanded Western influence and established the first global empire, becoming one of the world's major economic, political and military powers.\",\n",
|
|
|
|
|
" \"is_supporting\": false,\n",
|
|
|
|
|
" },\n",
|
|
|
|
|
" {\n",
|
|
|
|
|
" \"idx\": 11,\n",
|
|
|
|
|
" \"title\": \"Myanmar\",\n",
|
|
|
|
|
" \"paragraph_text\": \"The dynasty regrouped and defeated the Portuguese in 1613 and Siam in 1614. It restored a smaller, more manageable kingdom, encompassing Lower Myanmar, Upper Myanmar, Shan states, Lan Na and upper Tenasserim. The Restored Toungoo kings created a legal and political framework whose basic features would continue well into the 19th century. The crown completely replaced the hereditary chieftainships with appointed governorships in the entire Irrawaddy valley, and greatly reduced the hereditary rights of Shan chiefs. Its trade and secular administrative reforms built a prosperous economy for more than 80 years. From the 1720s onward, the kingdom was beset with repeated Meithei raids into Upper Myanmar and a nagging rebellion in Lan Na. In 1740, the Mon of Lower Myanmar founded the Restored Hanthawaddy Kingdom. Hanthawaddy forces sacked Ava in 1752, ending the 266-year-old Toungoo Dynasty.\",\n",
|
|
|
|
|
" \"is_supporting\": true,\n",
|
|
|
|
|
" },\n",
|
|
|
|
|
" {\n",
|
|
|
|
|
" \"idx\": 12,\n",
|
|
|
|
|
" \"title\": \"Somalis\",\n",
|
|
|
|
|
" \"paragraph_text\": \"The Somali flag is an ethnic flag conceived to represent ethnic Somalis. It was created in 1954 by the Somali scholar Mohammed Awale Liban, after he had been selected by the labour trade union of the Trust Territory of Somalia to come up with a design. Upon independence in 1960, the flag was adopted as the national flag of the nascent Somali Republic. The five-pointed Star of Unity in the flag's center represents the Somali ethnic group inhabiting the five territories in Greater Somalia.\",\n",
|
|
|
|
|
" \"is_supporting\": false,\n",
|
|
|
|
|
" },\n",
|
|
|
|
|
" {\n",
|
|
|
|
|
" \"idx\": 13,\n",
|
|
|
|
|
" \"title\": \"Muslim world\",\n",
|
|
|
|
|
" \"paragraph_text\": \"More than 20% of the world's population is Muslim. Current estimates conclude that the number of Muslims in the world is around 1,5 billion. Muslims are the majority in 49 countries, they speak hundreds of languages and come from diverse ethnic backgrounds. Major languages spoken by Muslims include Arabic, Urdu, Bengali, Punjabi, Malay, Javanese, Sundanese, Swahili, Hausa, Fula, Berber, Tuareg, Somali, Albanian, Bosnian, Russian, Turkish, Azeri, Kazakh, Uzbek, Tatar, Persian, Kurdish, Pashto, Balochi, Sindhi and Kashmiri, among many others.\",\n",
|
|
|
|
|
" \"is_supporting\": false,\n",
|
|
|
|
|
" },\n",
|
|
|
|
|
" {\n",
|
|
|
|
|
" \"idx\": 14,\n",
|
|
|
|
|
" \"title\": \"Somalis\",\n",
|
|
|
|
|
" \"paragraph_text\": \"The birth of Islam on the opposite side of Somalia's Red Sea coast meant that Somali merchants, sailors and expatriates living in the Arabian Peninsula gradually came under the influence of the new religion through their converted Arab Muslim trading partners. With the migration of fleeing Muslim families from the Islamic world to Somalia in the early centuries of Islam and the peaceful conversion of the Somali population by Somali Muslim scholars in the following centuries, the ancient city-states eventually transformed into Islamic Mogadishu, Berbera, Zeila, Barawa and Merca, which were part of the Berberi civilization. The city of Mogadishu came to be known as the City of Islam, and controlled the East African gold trade for several centuries.\",\n",
|
|
|
|
|
" \"is_supporting\": false,\n",
|
|
|
|
|
" },\n",
|
|
|
|
|
" {\n",
|
|
|
|
|
" \"idx\": 15,\n",
|
|
|
|
|
" \"title\": \"Mint (facility)\",\n",
|
|
|
|
|
" \"paragraph_text\": \"At about the same time, coins and mints appeared independently in China and spread to Korea and Japan. The manufacture of coins in the Roman Empire, dating from about the 4th century BC, significantly influenced later development of coin minting in Europe.\",\n",
|
|
|
|
|
" \"is_supporting\": false,\n",
|
|
|
|
|
" },\n",
|
|
|
|
|
" {\n",
|
|
|
|
|
" \"idx\": 16,\n",
|
|
|
|
|
" \"title\": \"Somalis\",\n",
|
|
|
|
|
" \"paragraph_text\": \"Growing out of the Somali people's rich storytelling tradition, the first few feature-length Somali films and cinematic festivals emerged in the early 1960s, immediately after independence. Following the creation of the Somali Film Agency (SFA) regulatory body in 1975, the local film scene began to expand rapidly. The Somali filmmaker Ali Said Hassan concurrently served as the SFA's representative in Rome. In the 1970s and early 1980s, popular musicals known as riwaayado were the main driving force behind the Somali movie industry. Epic and period films as well as international co-productions followed suit, facilitated by the proliferation of video technology and national television networks. Said Salah Ahmed during this period directed his first feature film, The Somali Darwish (The Somalia Dervishes), devoted to the Dervish State. In the 1990s and 2000s, a new wave of more entertainment-oriented movies emerged. Referred to as Somaliwood, this upstart, youth-based cinematic movement has energized the Somali film industry and in the process introduced innovative storylines, marketing strategies and production techniques. The young directors Abdisalam Aato of Olol Films and Abdi Malik Isak are at the forefront of this quiet revolution.\",\n",
|
|
|
|
|
" \"is_supporting\": false,\n",
|
|
|
|
|
" },\n",
|
|
|
|
|
" {\n",
|
|
|
|
|
" \"idx\": 17,\n",
|
|
|
|
|
" \"title\": \"Somalis\",\n",
|
|
|
|
|
" \"paragraph_text\": \"In 1975, the most prominent government reforms regarding family law in a Muslim country were set in motion in the Somali Democratic Republic, which put women and men, including husbands and wives, on complete equal footing. The 1975 Somali Family Law gave men and women equal division of property between the husband and wife upon divorce and the exclusive right to control by each spouse over his or her personal property.\",\n",
|
|
|
|
|
" \"is_supporting\": false,\n",
|
|
|
|
|
" },\n",
|
|
|
|
|
" {\n",
|
|
|
|
|
" \"idx\": 18,\n",
|
|
|
|
|
" \"title\": \"Germans\",\n",
|
|
|
|
|
" \"paragraph_text\": \"After World War II, eastern European countries such as the Soviet Union, Poland, Czechoslovakia, Hungary, Romania and Yugoslavia expelled the Germans from their territories. Many of those had inhabited these lands for centuries, developing a unique culture. Germans were also forced to leave the former eastern territories of Germany, which were annexed by Poland (Silesia, Pomerania, parts of Brandenburg and southern part of East Prussia) and the Soviet Union (northern part of East Prussia). Between 12 and 16,5 million ethnic Germans and German citizens were expelled westwards to allied-occupied Germany.\",\n",
|
|
|
|
|
" \"is_supporting\": false,\n",
|
|
|
|
|
" },\n",
|
|
|
|
|
" {\n",
|
|
|
|
|
" \"idx\": 19,\n",
|
|
|
|
|
" \"title\": \"David Htan\",\n",
|
|
|
|
|
" \"paragraph_text\": \"David Htan (; born 13 May 1990) is a burmese professional footballer who plays as a midfielder for Myanmar national football team and Shan United. David Htan suddenly moved to Shan United F.C. in May 2018.\",\n",
|
|
|
|
|
" \"is_supporting\": true,\n",
|
|
|
|
|
" },\n",
|
|
|
|
|
" ],\n",
|
|
|
|
|
" \"question\": \"How were the people that the Ajuran Empire declared independence from by minting coins expelled from David Htan's country?\",\n",
|
|
|
|
|
" \"question_decomposition\": [\n",
|
|
|
|
|
" {\n",
|
|
|
|
|
" \"id\": 101905,\n",
|
|
|
|
|
" \"question\": \"Of what country is David Htan a citizen?\",\n",
|
|
|
|
|
" \"answer\": \"Myanmar\",\n",
|
|
|
|
|
" \"paragraph_support_idx\": 19,\n",
|
|
|
|
|
" },\n",
|
|
|
|
|
" {\n",
|
|
|
|
|
" \"id\": 30152,\n",
|
|
|
|
|
" \"question\": \"New coins were a proclamation of independence by the Somali Muslim Ajuran Empire from whom?\",\n",
|
|
|
|
|
" \"answer\": \"the Portuguese\",\n",
|
|
|
|
|
" \"paragraph_support_idx\": 4,\n",
|
|
|
|
|
" },\n",
|
|
|
|
|
" {\n",
|
|
|
|
|
" \"id\": 20999,\n",
|
|
|
|
|
" \"question\": \"How were the #2 expelled from #1 ?\",\n",
|
|
|
|
|
" \"answer\": \"The dynasty regrouped and defeated the Portuguese\",\n",
|
|
|
|
|
" \"paragraph_support_idx\": 11,\n",
|
|
|
|
|
" },\n",
|
|
|
|
|
" ],\n",
|
|
|
|
|
" \"answer\": \"The dynasty regrouped and defeated the Portuguese\",\n",
|
|
|
|
|
" \"answer_aliases\": [],\n",
|
|
|
|
|
" \"answerable\": true,\n",
|
|
|
|
|
"}\n"
|
|
|
|
|
]
|
|
|
|
|
},
|
|
|
|
|
{
|
|
|
|
|
"cell_type": "markdown",
|
|
|
|
|
"metadata": {},
|
|
|
|
|
"source": [
|
|
|
|
|
"## Inspect think length of related works\n",
|
|
|
|
|
"### Search-R1 ~ 20 words\n",
|
|
|
|
|
"- <think> I need to find out which city and state a singer, by whom the women’s fragrance ”Curious” was created, was born in. </think>\n",
|
|
|
|
|
"- <think> I need to find out which city and state Britney Spears, the singer behind ”Curious” fragrance, was born in. </think>\n",
|
|
|
|
|
"- <think> Since Britney Spears was born in McComb, Mississippi, I need to find out which city and state that is. </think>\n",
|
|
|
|
|
"### R1-Seacher ~ 30 - 40 words\n",
|
|
|
|
|
"- To solve this question, I need to find out who the president of the United States was in the year that Citibank was founded and then identify that president. Step 1: Determine the year Citibank was founded. I’ll use a search query to find out:\n",
|
|
|
|
|
"- From the search results, I can see that Citibank was founded in 1812. Step 2: Identify the president of the United States in 1812. I’ll use a search query to find out:\n",
|
|
|
|
|
"### ReSearch 30 - 60 words - also make the most sense!\n",
|
|
|
|
|
"- To find the answer, I first need to determine the country in which Martín Ramírez Pech’s political party operates. I will search for \"Martín Ramírez Pech\" to find out this information.\n",
|
|
|
|
|
"- It appears that I made a mistake in my search query. Instead of searching for \"Martín Ramírez Pech\", the results are showing information about a different \"Martín Ramírez\". To find out the right information, I need to look for Martín Ramírez Pech specifically and the political party associated with him. I will modify the search query to \"Martín Ramírez Pech political party\" for the correct information. \n",
|
|
|
|
|
"- Now, we see that Martín Ramírez Pech is associated with the political party known as the Labor Party (PT) in Mexico. I now need to continue this line of reasoning to find out who won the presidential election in 2018 in Mexico. \n",
|
|
|
|
|
"\n",
|
|
|
|
|
"### Ours (DeepSearch 250404) ~30 - 40 words (holyf didn't expect this to be this good) \n",
|
|
|
|
|
"- Q: The flag of the territory that Harry Collinge High School is located in is a subclass of what? (taken from test set of musique)\n",
|
|
|
|
|
"- The information about Harry Collinge High School is required to answer this question. The school is located in the territory of Jersey, a self-governing island within the British Crown.\n",
|
|
|
|
|
"- Based on the provided information about the flag of Jersey, it is clear that the flag of Jersey is the British Red Ensign with a distinctive design.\n",
|
|
|
|
|
"- Based on the information about the flag of Jersey, it seems that the flag is a subclass of the flag of the United Kingdom, specifically the British Red Ensign, but with additional design elements.\n",
|
|
|
|
|
"- Based on the information about the Red Ensign, it seems that the British Red Ensign is a subclass of the flag of the United Kingdom, which includes the Union Jack.\n",
|
|
|
|
|
"- Based on the information about the Union Jack, it seems that the Union Jack is the flag of the United Kingdom, which is a subclass of the flag of the British Commonwealth, but more broadly, it is a subclass of the flag of the United Kingdom's predecessor, the Kingdom of Great Britain.\n",
|
|
|
|
|
"- Based on the information about the flag of Great Britain, it seems that the flag of Great Britain is the precursor to the Union Jack of 1801, and it was used as the flag of the Kingdom of Great Britain from 1707 to 1801.\n",
|
|
|
|
|
"- ℹ️ℹ️ TOKEN LIMIT EXCEEDED (After Iteration): Current 4950, Start 315 -> IT\"S SCALE UP TIME!\n",
|
|
|
|
|
"\n"
|
|
|
|
|
]
|
|
|
|
|
},
|
|
|
|
|
{
|
|
|
|
|
"cell_type": "markdown",
|
|
|
|
|
"metadata": {},
|
|
|
|
|
"source": [
|
|
|
|
|
"## New Plan\n",
|
|
|
|
|
"- scirpt to prepare musicque (download, unzip, move to the right place, dev subset, bla bla)\n",
|
|
|
|
|
"- script to turn musicque into our format\n",
|
|
|
|
|
"- trainning goes brrrr"
|
|
|
|
|
]
|
|
|
|
|
},
|
|
|
|
|
{
|
|
|
|
|
"cell_type": "code",
|
|
|
|
|
"execution_count": null,
|
|
|
|
|
"metadata": {},
|
|
|
|
|
"outputs": [],
|
|
|
|
|
"source": [
|
|
|
|
|
"# Test DeepScaleR-1.5B to see its thinking process\n",
|
|
|
|
|
"from transformers.pipelines import pipeline\n",
|
|
|
|
|
"\n",
|
|
|
|
|
"pipe = pipeline(\"text-generation\", model=\"agentica-org/DeepScaleR-1.5B-Preview\", device=\"cuda:1\", max_length=8192)\n"
|
|
|
|
|
]
|
|
|
|
|
},
|
|
|
|
|
{
|
|
|
|
|
"cell_type": "code",
|
|
|
|
|
"execution_count": null,
|
|
|
|
|
"metadata": {},
|
|
|
|
|
"outputs": [],
|
|
|
|
|
"source": [
|
|
|
|
|
"# LOL for this question it took 4,5 minutes to generate the reponse and not yet have the answer (set max_length=8192)\n",
|
|
|
|
|
"question = \"\"\"\n",
|
|
|
|
|
"Each vertex of a regular octagon is independently colored either red or blue with equal probability. The probability that the octagon can then be rotated so that all of the blue vertices end up at positions where there were originally red vertices is $\\tfrac{m}{n}$, where $m$ and $n$ are relatively prime positive integers. What is $m+n$?\n",
|
|
|
|
|
"\"\"\"\n",
|
|
|
|
|
"print(pipe(question))\n",
|
|
|
|
|
"\n"
|
|
|
|
|
]
|
|
|
|
|
}
|
|
|
|
|
],
|
|
|
|
|
"metadata": {
|
|
|
|
|
"kernelspec": {
|
|
|
|
|
"display_name": "deepsearch-py311-2",
|
|
|
|
|
"language": "python",
|
|
|
|
|
"name": "python3"
|
|
|
|
|
},
|
|
|
|
|
"language_info": {
|
|
|
|
|
"codemirror_mode": {
|
|
|
|
|
"name": "ipython",
|
|
|
|
|
"version": 3
|
|
|
|
|
},
|
|
|
|
|
"file_extension": ".py",
|
|
|
|
|
"mimetype": "text/x-python",
|
|
|
|
|
"name": "python",
|
|
|
|
|
"nbconvert_exporter": "python",
|
|
|
|
|
"pygments_lexer": "ipython3",
|
|
|
|
|
"version": "3.11.11"
|
|
|
|
|
}
|
|
|
|
|
},
|
|
|
|
|
"nbformat": 4,
|
|
|
|
|
"nbformat_minor": 2
|
|
|
|
|
}
|