Getting started with duckduckgo_search

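Search for words, documents, images, videos, news, maps and text translation using the DuckDuckGo.com search engine. Download files and images to a local hard drive.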

⚠️ Warning: use AsyncDDGS in asynchronous code

Installation

pip install -U duckduckgo_search

CLI version

ddgs --help

or

python -m duckduckgo_search --help

CLI examples:

# text search
ddgs text -k "ayrton senna"
# text search via proxy (example: Tor Browser)
ddgs text -k "china is a global threat" -p socks5://localhost:9150
# find and download pdf files
ddgs text -k "russia filetype:pdf" -m 50 -d
# find in the es-es region and download pdf files via proxy (example: Tor Browser)
ddgs text -k "embajada a tamorlán filetype:pdf" -r es-es -m 50 -d -p socks5://localhost:9150
# find and download xls files from a specific site
ddgs text -k "sanctions filetype:xls site:gov.ua" -m 50 -d
# find and download any doc(x) files from a specific site
ddgs text -k "filetype:doc site:mos.ru" -m 50 -d
# find and download images
ddgs images -k "yuri kuklachev cat theatre" -m 500 -s off -d
# find in the br-br region and download images via proxy (example: Tor Browser) in 10 threads
ddgs images -k "rio carnival" -r br-br -s off -m 500 -d -th 10 -p socks5://localhost:9150
# get latest news
ddgs news -k "ukraine war" -s off -t d -m 10
# get the last day's news and save it to a csv file
ddgs news -k "hubble telescope" -t d -m 50 -o csv
# get answers and save them to a json file
ddgs answers -k holocaust -o json
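
The -o csv option above saves results to a file. For results obtained from Python code, a similar effect can be sketched with the standard csv module. This is a minimal sketch, not the CLI's own implementation: the helper name results_to_csv is hypothetical, and the sample rows are made up to stand in for real search results.

```python
import csv
import io

def results_to_csv(results, stream):
    """Write result dicts to `stream` as CSV and return the number of rows."""
    rows = list(results)
    if not rows:
        return 0
    # Collect the union of keys in first-seen order so that rows with
    # missing fields still fit under one header.
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)
    writer = csv.DictWriter(stream, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)  # missing fields are written as empty strings
    return len(rows)

# Made-up sample rows (assumption: real results are flat dicts of strings).
sample = [
    {"title": "Hubble", "url": "https://example.com/a"},
    {"title": "Telescope", "url": "https://example.com/b", "source": "example"},
]
buffer = io.StringIO()
count = results_to_csv(sample, buffer)
print(count)
print(buffer.getvalue())
```

To write to disk instead of an in-memory buffer, pass a file opened with open("results.csv", "w", newline="", encoding="utf-8").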

DuckDuckGo search operators

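Keywords example         Result
cats dogs                Results about cats or dogs
"cats and dogs"          Results for the exact term "cats and dogs". If no results are found, related results are shown.
cats -dogs               Fewer dogs in results
cats +dogs               More dogs in results
cats filetype:pdf        PDFs about cats. Supported file types: pdf, doc(x), xls(x), ppt(x), html
dogs site:example.com    Pages about dogs from example.com
cats -site:example.com   Pages about cats, excluding example.com
intitle:dogs             Page title includes the word "dogs"
inurl:cats               Page URL includes the word "cats"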
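
The operators above are plain text inside the query, so a query can be assembled with ordinary string handling before it is passed as keywords. A minimal sketch, assuming a hypothetical helper named build_query (it is not part of duckduckgo_search):

```python
def build_query(terms, exact=False, filetype=None, site=None,
                exclude_site=None, intitle=None, inurl=None):
    """Compose a query string from plain terms plus optional operators."""
    # Wrap the terms in double quotes for an exact-phrase search.
    parts = [f'"{terms}"' if exact else terms]
    if filetype:
        parts.append(f"filetype:{filetype}")
    if site:
        parts.append(f"site:{site}")
    if exclude_site:
        parts.append(f"-site:{exclude_site}")
    if intitle:
        parts.append(f"intitle:{intitle}")
    if inurl:
        parts.append(f"inurl:{inurl}")
    return " ".join(parts)

print(build_query("sanctions", filetype="xls", site="gov.ua"))
# sanctions filetype:xls site:gov.ua
print(build_query("cats and dogs", exact=True))
# "cats and dogs"
```

The resulting string can be used wherever a keywords argument is expected, for example in the text search described later.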

Regions

xa-ar for Arabia
xa-en for Arabia (en)
ar-es for Argentina
au-en for Australia
at-de for Austria
be-fr for Belgium (fr)
be-nl for Belgium (nl)
br-pt for Brazil
bg-bg for Bulgaria
ca-en for Canada
ca-fr for Canada (fr)
ct-ca for Catalan
cl-es for Chile
cn-zh for China
co-es for Colombia
hr-hr for Croatia
cz-cs for Czech Republic
dk-da for Denmark
ee-et for Estonia
fi-fi for Finland
fr-fr for France
de-de for Germany
gr-el for Greece
hk-tzh for Hong Kong
hu-hu for Hungary
in-en for India
id-id for Indonesia
id-en for Indonesia (en)
ie-en for Ireland
il-he for Israel
it-it for Italy
jp-jp for Japan
kr-kr for Korea
lv-lv for Latvia
lt-lt for Lithuania
xl-es for Latin America
my-ms for Malaysia
my-en for Malaysia (en)
mx-es for Mexico
nl-nl for Netherlands
nz-en for New Zealand
no-no for Norway
pe-es for Peru
ph-en for Philippines
ph-tl for Philippines (tl)
pl-pl for Poland
pt-pt for Portugal
ro-ro for Romania
ru-ru for Russia
sg-en for Singapore
sk-sk for Slovak Republic
sl-sl for Slovenia
za-en for South Africa
es-es for Spain
se-sv for Sweden
ch-de for Switzerland (de)
ch-fr for Switzerland (fr)
ch-it for Switzerland (it)
tw-tzh for Taiwan
th-th for Thailand
tr-tr for Turkey
ua-uk for Ukraine
uk-en for United Kingdom
us-en for United States
ue-es for United States (es)
ve-es for Venezuela
vn-vi for Vietnam
wt-wt for No region

DDGS and AsyncDDGS classes

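The DDGS and AsyncDDGS classes are used to retrieve search results from DuckDuckGo.com.
To use the AsyncDDGS class, you can perform asynchronous operations using Python's asyncio library.
To initialize an instance of the DDGS or AsyncDDGS class, you can provide the following optional arguments: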

class DDGS:
    """DuckDuckgo_search class to get search results from duckduckgo.com

    Args:
        headers (dict, optional): Dictionary of headers for the HTTP client. Defaults to None.
        proxies (Union[dict, str], optional): Proxies for the HTTP client (can be dict or str). Defaults to None.
        timeout (int, optional): Timeout value for the HTTP client. Defaults to 10.
    """

Here is an example of initializing the DDGS class:

from duckduckgo_search import DDGS

with DDGS() as ddgs:
    results = [r for r in ddgs.text("python programming", max_results=5)]
    print(results)

Here is an example of initializing the AsyncDDGS class:

import asyncio
import logging
import sys
from itertools import chain
from random import shuffle

import requests
from duckduckgo_search import AsyncDDGS

# bypass curl-cffi NotImplementedError in windows https://curl-cffi.readthedocs.io/en/latest/faq/
if sys.platform.lower().startswith("win"):
    asyncio.set_event_loop_policy(asyncio.WindowsSelectorEventLoopPolicy())

def get_words():
    word_site = "https://www.mit.edu/~ecprice/wordlist.10000"
    resp = requests.get(word_site)
    words = resp.text.splitlines()
    return words

# No proxy by default; set this to e.g. "socks5://localhost:9150" to use one.
proxies = None

async def aget_results(word):
    async with AsyncDDGS(proxies=proxies) as ddgs:
        results = [r async for r in ddgs.text(word, max_results=None)]
        return results

async def main():
    words = get_words()
    shuffle(words)
    # Search the first 10 shuffled words concurrently.
    tasks = [aget_results(word) for word in words[:10]]
    results = await asyncio.gather(*tasks)
    print("Done")
    for r in chain.from_iterable(results):
        print(r)

if __name__ == "__main__":
    logging.basicConfig(level=logging.DEBUG)
    asyncio.run(main())
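
The example above starts all searches at once with asyncio.gather. When many words are searched, it may be preferable to cap how many requests run at the same time. The sketch below shows one way to do that with asyncio.Semaphore. It is only an illustration: fake_search is a made-up stand-in for a real search coroutine such as aget_results, and bounded_gather is a hypothetical helper name.

```python
import asyncio

async def fake_search(word):
    """Stand-in for a real search coroutine; returns made-up results."""
    await asyncio.sleep(0)  # yield control, as a network call would
    return [{"title": f"{word} result", "href": f"https://example.com/{word}"}]

async def bounded_gather(words, search, limit=3):
    """Run search(word) for every word with at most `limit` running at once."""
    semaphore = asyncio.Semaphore(limit)

    async def run_one(word):
        async with semaphore:  # wait here while `limit` searches are active
            return await search(word)

    # gather returns results in the same order as the input words.
    return await asyncio.gather(*(run_one(w) for w in words))

results = asyncio.run(
    bounded_gather(["cat", "dog", "bird", "fish"], fake_search, limit=2)
)
print(len(results))
```

In real use, the search argument would be a coroutine function that opens AsyncDDGS and collects results for one word.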

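It is important to note that the DDGS and AsyncDDGS classes should always be used as context managers (with statement).
This ensures proper resource management and cleanup, because the context manager automatically handles opening and closing the HTTP client connection.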
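
To illustrate why the with statement matters, the sketch below uses a made-up class, FakeSearchClient, that only tracks whether it is open. It is not the library's implementation. It shows the general Python guarantee that __exit__ runs even when the body of the with block raises.

```python
class FakeSearchClient:
    """Hypothetical stand-in that mimics opening and closing a connection."""

    def __init__(self):
        self.is_open = False

    def __enter__(self):
        self.is_open = True  # "open the connection"
        return self

    def __exit__(self, exc_type, exc, tb):
        self.is_open = False  # "close the connection", always executed
        return False  # do not swallow exceptions from the with body

client = FakeSearchClient()
try:
    with client as c:
        opened_inside = c.is_open
        raise RuntimeError("simulated failure in the middle of a search")
except RuntimeError:
    pass

print(opened_inside)   # True
print(client.is_open)  # False
```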

Proxies

Proxies can be specified as a dictionary or as a string:

proxies = {"http": "socks5://localhost:9150", "https": "socks5://localhost:9150"}
proxies = "socks5://localhost:9150"
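
The two forms above appear to describe the same setup: one proxy URL used for both http and https. The sketch below shows how a string could be expanded into the dictionary form. This is an assumption made for illustration, not a description of the library's internals, and normalize_proxies is a hypothetical helper name.

```python
def normalize_proxies(proxies):
    """Return proxies as a dict, expanding a single URL string if needed."""
    if proxies is None:
        return None
    if isinstance(proxies, str):
        # Assumption: a single URL applies to both http and https traffic.
        return {"http": proxies, "https": proxies}
    if isinstance(proxies, dict):
        return dict(proxies)  # copy so the caller's dict is not shared
    raise TypeError("proxies must be None, a str or a dict")

print(normalize_proxies("socks5://localhost:9150"))
# {'http': 'socks5://localhost:9150', 'https': 'socks5://localhost:9150'}
```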

1. The easiest way: launch the Tor Browser

from duckduckgo_search import DDGS

with DDGS(proxies="socks5://localhost:9150", timeout=20) as ddgs:
    for r in ddgs.text("something you need", max_results=50):
        print(r)

2. Use any proxy server (example with iproyal residential proxies)

from duckduckgo_search import DDGS

with DDGS(proxies="socks5://user:password@geo.iproyal.com:32325", timeout=20) as ddgs:
    for r in ddgs.text("something you need", max_results=50):
        print(r)

Exceptions

Exceptions:

  • DuckDuckGoSearchException: raised when a generic exception occurs during an API request.
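
A common way to deal with such an exception is to retry the request a few times. The sketch below is a generic retry helper, not part of the library. SearchError is a made-up stand-in so that the example runs without network access. In real code one would pass the library's exception class in the exceptions argument (assumption: it can be imported from the package).

```python
import time

class SearchError(Exception):
    """Hypothetical stand-in for the library's DuckDuckGoSearchException."""

def with_retries(func, exceptions=(SearchError,), attempts=3, delay=0.0):
    """Call func() until it succeeds or `attempts` calls have failed."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except exceptions:
            if attempt == attempts:
                raise  # out of attempts: re-raise the last error
            time.sleep(delay * attempt)  # simple linear backoff

calls = {"count": 0}

def flaky_search():
    """Made-up search that fails twice, then succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise SearchError("simulated failure")
    return ["result"]

outcome = with_retries(flaky_search, attempts=3)
print(outcome, calls["count"])
```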

1. text() - text search by duckduckgo.com

def text(
    keywords: str,
    region: str = "wt-wt",
    safesearch: str = "moderate",
    timelimit: Optional[str] = None,
    backend: str = "api",
    max_results: Optional[int] = None,
) -> Iterator[Dict[str, Optional[str]]]:
    """DuckDuckGo text search generator. Query params: https://duckduckgo.com/params

    Args:
        keywords: keywords for query.
        region: wt-wt, us-en, uk-en, ru-ru, etc. Defaults to "wt-wt".
        safesearch: on, moderate, off. Defaults to "moderate".
        timelimit: d, w, m, y. Defaults to None.
        backend: api, html, lite. Defaults to api.
            api - collect data from https://duckduckgo.com,
            html - collect data from https://html.duckduckgo.com,
            lite - collect data from https://lite.duckduckgo.com.
        max_results: max number of results. If None, returns results only from the first response. Defaults to None.

    Yields:
        dict with search results.

    """

Example

from duckduckgo_search import DDGS

with DDGS() as ddgs:
    for r in ddgs.text('live free or die', region='wt-wt', safesearch='off', timelimit='y', max_results=10):
        print(r)

# Searching for pdf files
with DDGS() as ddgs:
    for r in ddgs.text('russia filetype:pdf', region='wt-wt', safesearch='off', timelimit='y', max_results=10):
        print(r)
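
Because text() is a generator, results can be processed lazily. The sketch below drops duplicate links and keeps only the first few results. It runs on made-up data: fake_text_search is a stand-in for ddgs.text(...), and the keys title, href and body are an assumption about the shape of each result dict.

```python
from itertools import islice

def fake_text_search():
    """Stand-in generator yielding made-up results, including one duplicate."""
    yield {"title": "A", "href": "https://example.com/a", "body": "first"}
    yield {"title": "A again", "href": "https://example.com/a", "body": "duplicate"}
    yield {"title": "B", "href": "https://example.com/b", "body": "second"}
    yield {"title": "C", "href": "https://example.com/c", "body": "third"}

def unique_by(results, key="href"):
    """Yield results whose `key` value has not been seen before."""
    seen = set()
    for item in results:
        value = item.get(key)
        if value in seen:
            continue
        seen.add(value)
        yield item

# Take the first two unique results without consuming the whole generator.
first_two = list(islice(unique_by(fake_text_search()), 2))
for r in first_two:
    print(r["title"], r["href"])
```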

2. answers() - instant answers by duckduckgo.com

def answers(keywords: str) -> Iterator[Dict[str, Optional[str]]]:
    """DuckDuckGo instant answers. Query params: https://duckduckgo.com/params

    Args:
        keywords: keywords for query.

    Yields:
        dict with instant answers results.

    """

Example

from duckduckgo_search import DDGS

with DDGS() as ddgs:
    for r in ddgs.answers("sun"):
        print(r)

3. images() - image search by duckduckgo.com

def images(
    keywords: str,
    region: str = "wt-wt",
    safesearch: str = "moderate",
    timelimit: Optional[str] = None,
    size: Optional[str] = None,
    color: Optional[str] = None,
    type_image: Optional[str] = None,
    layout: Optional[str] = None,
    license_image: Optional[str] = None,
    max_results: Optional[int] = None,
) -> Iterator[Dict[str, Optional[str]]]:
    """DuckDuckGo images search. Query params: https://duckduckgo.com/params

    Args:
        keywords: keywords for query.
        region: wt-wt, us-en, uk-en, ru-ru, etc. Defaults to "wt-wt".
        safesearch: on, moderate, off. Defaults to "moderate".
        timelimit: Day, Week, Month, Year. Defaults to None.
        size: Small, Medium, Large, Wallpaper. Defaults to None.
        color: color, Monochrome, Red, Orange, Yellow, Green, Blue,
            Purple, Pink, Brown, Black, Gray, Teal, White. Defaults to None.
        type_image: photo, clipart, gif, transparent, line.
            Defaults to None.
        layout: Square, Tall, Wide. Defaults to None.
        license_image: any (All Creative Commons), Public (PublicDomain),
            Share (Free to Share and Use), ShareCommercially (Free to Share and Use Commercially),
            Modify (Free to Modify, Share, and Use), ModifyCommercially (Free to Modify, Share, and
            Use Commercially). Defaults to None.
        max_results: max number of results. If None, returns results only from the first response. Defaults to None.

    Yields:
        dict with image search results.

    """

Example

from duckduckgo_search import DDGS

with DDGS() as ddgs:
    keywords = 'butterfly'
    ddgs_images_gen = ddgs.images(
      keywords,
      region="wt-wt",
      safesearch="off",
      size=None,
      color="Monochrome",
      type_image=None,
      layout=None,
      license_image=None,
      max_results=100,
    )
    for r in ddgs_images_gen:
        print(r)

4. videos() - video search by duckduckgo.com

def videos(
    keywords: str,
    region: str = "wt-wt",
    safesearch: str = "moderate",
    timelimit: Optional[str] = None,
    resolution: Optional[str] = None,
    duration: Optional[str] = None,
    license_videos: Optional[str] = None,
    max_results: Optional[int] = None,
) -> Iterator[Dict[str, Optional[str]]]:
    """DuckDuckGo videos search. Query params: https://duckduckgo.com/params

    Args:
        keywords: keywords for query.
        region: wt-wt, us-en, uk-en, ru-ru, etc. Defaults to "wt-wt".
        safesearch: on, moderate, off. Defaults to "moderate".
        timelimit: d, w, m. Defaults to None.
        resolution: high, standart. Defaults to None.
        duration: short, medium, long. Defaults to None.
        license_videos: creativeCommon, youtube. Defaults to None.
        max_results: max number of results. If None, returns results only from the first response. Defaults to None.

    Yields:
        dict with video search results.

    """

Example

from duckduckgo_search import DDGS

with DDGS() as ddgs:
    keywords = 'tesla'
    ddgs_videos_gen = ddgs.videos(
      keywords,
      region="wt-wt",
      safesearch="off",
      timelimit="w",
      resolution="high",
      duration="medium",
      max_results=100,
    )
    for r in ddgs_videos_gen:
        print(r)
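
The option values listed in the videos() docstring are plain strings, so a misspelled value is easy to pass by accident. The sketch below checks options against those documented values before a search is made. It is a hypothetical helper, not part of the library, and the allowed values are copied from the docstring above exactly as spelled there (including standart).

```python
# Allowed values copied from the videos() docstring (None is always allowed).
VIDEO_OPTIONS = {
    "resolution": {"high", "standart"},
    "duration": {"short", "medium", "long"},
    "license_videos": {"creativeCommon", "youtube"},
}

def check_video_options(**options):
    """Raise ValueError for unknown option names or undocumented values."""
    for name, value in options.items():
        if name not in VIDEO_OPTIONS:
            raise ValueError(f"unknown option: {name}")
        if value is not None and value not in VIDEO_OPTIONS[name]:
            allowed = ", ".join(sorted(VIDEO_OPTIONS[name]))
            raise ValueError(f"{name}={value!r} is not one of: {allowed}")
    return options

checked = check_video_options(resolution="high", duration="medium")
print(checked)
# {'resolution': 'high', 'duration': 'medium'}
```

The checked dictionary could then be unpacked into the call, for example ddgs.videos(keywords, **checked).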

5. news() - news search by duckduckgo.com

def news(
    keywords: str,
    region: str = "wt-wt",
    safesearch: str = "moderate",
    timelimit: Optional[str] = None,
    max_results: Optional[int] = None,
) -> Iterator[Dict[str, Optional[str]]]:
    """DuckDuckGo news search. Query params: https://duckduckgo.com/params

    Args:
        keywords: keywords for query.
        region: wt-wt, us-en, uk-en, ru-ru, etc. Defaults to "wt-wt".
        safesearch: on, moderate, off. Defaults to "moderate".
        timelimit: d, w, m. Defaults to None.
        max_results: max number of results. If None, returns results only from the first response. Defaults to None.

    Yields:
        dict with news search results.

    """

Example

from duckduckgo_search import DDGS

with DDGS() as ddgs:
    keywords = 'holiday'
    ddgs_news_gen = ddgs.news(
      keywords,
      region="wt-wt",
      safesearch="off",
      timelimit="m",
      max_results=20
    )
    for r in ddgs_news_gen:
        print(r)

6. maps() - map search by duckduckgo.com

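def maps(
    keywords,
    place: Optional[str] = None,
    street: Optional[str] = None,
    city: Optional[str] = None,
    county: Optional[str] = None,
    state: Optional[str] = None,
    country: Optional[str] = None,
    postalcode: Optional[str] = None,
    latitude: Optional[str] = None,
    longitude: Optional[str] = None,
    radius: int = 0,
    max_results: Optional[int] = None,
) -> Iterator[Dict[str, Optional[str]]]:
    """DuckDuckGo maps search. Query params: https://duckduckgo.com/params

    Args:
        keywords: keywords for query.
        place: if set, the other parameters are not used. Defaults to None.
        street: house number/street. Defaults to None.
        city: city of search. Defaults to None.
        county: county of search. Defaults to None.
        state: state of search. Defaults to None.
        country: country of search. Defaults to None.
        postalcode: postal code of search. Defaults to None.
        latitude: geographic coordinate (north-south position). Defaults to None.
        longitude: geographic coordinate (east-west position); if latitude and
            longitude are set, the other parameters are not used. Defaults to None.
        radius: expand the search square by this distance in kilometers. Defaults to 0.
        max_results: max number of results. If None, returns results only from the first response. Defaults to None.

    Yields:
        dict with maps search results.

    """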

Example

from duckduckgo_search import DDGS

with DDGS() as ddgs:
    for r in ddgs.maps("school", place="Uganda", max_results=50):
        print(r)
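
The radius parameter is described as expanding a search square by a distance in kilometers. The sketch below illustrates what such a square could look like around a coordinate. It is a geometric illustration under a flat approximation of about 111.32 km per degree of latitude, not the library's own computation, and bounding_box is a hypothetical helper name.

```python
import math

KM_PER_DEGREE = 111.32  # approximate length of one degree of latitude

def bounding_box(latitude, longitude, radius_km):
    """Return (south, west, north, east) of a square around the point."""
    dlat = radius_km / KM_PER_DEGREE
    # A degree of longitude shrinks with the cosine of the latitude.
    # Clamp the cosine so the division stays finite near the poles.
    cos_lat = max(math.cos(math.radians(latitude)), 1e-6)
    dlon = radius_km / (KM_PER_DEGREE * cos_lat)
    return (latitude - dlat, longitude - dlon, latitude + dlat, longitude + dlon)

south, west, north, east = bounding_box(0.0, 0.0, 111.32)
print(south, west, north, east)
```

At the equator a radius of 111.32 km extends the square by about one degree in each direction. At higher latitudes the east-west extent in degrees grows.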

7. translate() - translation by duckduckgo.com

def translate(
    self,
    keywords: str,
    from_: Optional[str] = None,
    to: str = "en",
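) -> Optional[Dict[str, Optional[str]]]:
    """DuckDuckGo translate.

    Args:
        keywords: string or list of strings to translate.
        from_: language to translate from (detected automatically by default). Defaults to None.
        to: language to translate to. Defaults to "en".

    Returns:
        dict with the translated keywords.
    """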

Example

from duckduckgo_search import DDGS

with DDGS() as ddgs:
    keywords = 'school'
    r = ddgs.translate(keywords, to="de")
    print(r)

8. suggestions() - suggestions by duckduckgo.com

def suggestions(
    keywords,
    region: str = "wt-wt",
) -> Iterator[Dict[str, Optional[str]]]:
    """DuckDuckGo suggestions. Query params: https://duckduckgo.com/params

    Args:
        keywords: keywords for query.
        region: wt-wt, us-en, uk-en, ru-ru, etc. Defaults to "wt-wt".

    Yields:
        dict with suggestions results.
    """

Example

from duckduckgo_search import DDGS

with DDGS() as ddgs:
    for r in ddgs.suggestions("fly"):
        print(r)