crawl4ai version
v0.8.9
Expected Behavior
Python Version: 3.14.5
Here is my code:
import asyncio

from crawl4ai import AsyncWebCrawler, CacheMode
from crawl4ai import DefaultTableExtraction
from crawl4ai.async_configs import BrowserConfig, CrawlerRunConfig
from crawl4ai.content_filter_strategy import PruningContentFilter
from crawl4ai.markdown_generation_strategy import DefaultMarkdownGenerator

target_url = "https://en.wikipedia.org/wiki/List_of_prime_ministers_of_India"

# browser_config
browser_config = BrowserConfig(
    headless=True,
    user_agent_mode='random',
)

prune_filter = PruningContentFilter(
    threshold=0.8,
    threshold_type="dynamic",
)

# CrawlerConfig
run_config = CrawlerRunConfig(
    magic=True,
    markdown_generator=DefaultMarkdownGenerator(
        content_source="raw_html",
        options={
            'bypass_tables': True,
        }
    ),
    cache_mode=CacheMode.BYPASS,
    css_selector='table.wikitable',
    flatten_shadow_dom=True,
    keep_attrs=['rowspan', 'colspan'],
    table_extraction=DefaultTableExtraction()
)


async def main():
    async with AsyncWebCrawler(config=browser_config) as crawler:
        result = await crawler.arun(url=target_url, config=run_config)
        print(result.markdown)
        print(result.markdown.fit_markdown)
        print(result.tables)
        with open('test_raw.md', 'w') as f:
            f.write(result.markdown)
        with open('test_fit.md', 'w') as f:
            f.write(result.markdown.fit_markdown)


if __name__ == "__main__":
    asyncio.run(main())
Current Behavior
I tried many times, but neither rowspan nor colspan appears in result.markdown.
test_raw.md
Is this reproducible?
Yes
Inputs Causing the Bug
- url: "https://en.wikipedia.org/wiki/List_of_prime_ministers_of_India"
- browser_config:
    # browser_config
    browser_config = BrowserConfig(
        headless=True,
        user_agent_mode='random',
    )
- crawler_config:
    # CrawlerConfig
    run_config = CrawlerRunConfig(
        magic=True,
        markdown_generator=DefaultMarkdownGenerator(
            content_source="raw_html",
            #content_filter=prune_filter,
            options={
                'bypass_tables': True,
            }
        ),
        cache_mode=CacheMode.BYPASS,
        css_selector='table.wikitable',
        flatten_shadow_dom=True,
        keep_attrs=['rowspan', 'colspan'],
        table_extraction=DefaultTableExtraction()
    )
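To narrow down which stage drops the attributes, it may help to count how many real spans (value greater than 1) each HTML string still contains. The helper below is a hypothetical diagnostic written with the standard library only. It is a rough regex check, not an HTML parser, and `count_span_attrs` is a name introduced here for illustration.

```python
import re

# Matches rowspan="2", colspan='3', or unquoted colspan=3 (case-insensitive).
SPAN_ATTR = re.compile(r'\b(rowspan|colspan)\s*=\s*["\']?(\d+)', re.IGNORECASE)


def count_span_attrs(html):
    """Count rowspan/colspan attributes whose value is greater than 1."""
    counts = {"rowspan": 0, "colspan": 0}
    for name, value in SPAN_ATTR.findall(html or ""):
        if int(value) > 1:
            counts[name.lower()] += 1
    return counts


if __name__ == "__main__":
    # Made-up fragment for demonstration.
    fragment = '<td rowspan="2">x</td><td colspan=3>y</td><td rowspan="1">z</td>'
    print(count_span_attrs(fragment))
```

In the script from this report, it could be applied to `result.html` and `result.cleaned_html`, assuming those fields hold the raw and cleaned HTML in this version. If the counts are non-zero there but the Markdown shows no span information, the loss most likely happens during Markdown generation.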
Steps to Reproduce
Code snippets
OS
macOS
Python version
3.14.5
Browser
Chrome
Browser version
No response
Error logs & Screenshots (if applicable)
No response