Need help!

PHPz

Published: 2024-08-16 18:05:40

Reposted

Hi, I'm new to programming and need help from someone experienced with web scraping. My task is to extract the "About the client" section from a set of job links. My script extracts it for only one link; for every other link it fails and raises an error. I get the job links from an XML file, and when those links are opened, the HTML is rendered by JavaScript, which is why I'm using Selenium. I've tried everything I can think of but haven't found a solution.

```python
import time

import numpy as np
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

# `driver` is assumed to be a Selenium WebDriver instance created elsewhere in the script.


def extract_client_info(job_url):
    client_info = {'About the Client': np.nan}

    if job_url and job_url != "N/A":
        try:
            # Open the job URL
            driver.get(job_url)

            # Wait up to 30 seconds for the "About the client" section to appear
            WebDriverWait(driver, 30).until(
                EC.presence_of_element_located((By.CSS_SELECTOR, '.cfe-about-client-v2'))
            )

            # Extract specific details
            about_client_section = driver.find_element(By.CSS_SELECTOR, '.cfe-about-client-v2')
            client_location = about_client_section.find_element(By.CSS_SELECTOR, '[data-qa="client-location"]').text.strip()
            client_job_posting_stats = about_client_section.find_element(By.CSS_SELECTOR, '[data-qa="client-job-posting-stats"]').text.strip() if about_client_section.find_elements(By.CSS_SELECTOR, '[data-qa="client-job-posting-stats"]') else "N/A"
            client_company_profile = about_client_section.find_element(By.CSS_SELECTOR, '[data-qa="client-company-profile"]').text.strip()

            # Combine extracted information
            client_info['About the Client'] = (
                f"Location: {client_location}\n"
                f"Job Posting Stats: {client_job_posting_stats}\n"
                f"Company Profile: {client_company_profile}"
            )

        except Exception as e:
            print(f"Failed to get 'About the Client' for {job_url}: {e}")
            client_info['About the Client'] = np.nan

        finally:
            # Wait for 10 seconds before making the next request
            time.sleep(10)

    return client_info
```
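One thing that might be worth checking (the post does not include the actual error, so this is only a guess): `find_element` raises an exception when nothing matches, and only the job posting stats lookup is guarded against that. If some job pages have no location or company profile block, the whole extraction for that link fails. Below is a minimal sketch of applying the same `find_elements` guard to every field. The helper names `safe_text` and `build_about_client` are made up for illustration, and the `CSS` constant simply holds the same string as Selenium's `By.CSS_SELECTOR`, so the helpers can be tried without a browser.

```python
CSS = "css selector"  # same string value as selenium's By.CSS_SELECTOR


def safe_text(parent, selector, default="N/A"):
    """Return the stripped text of the first match, or `default` if nothing matches."""
    # find_elements returns an empty list instead of raising when there is no match
    matches = parent.find_elements(CSS, selector)
    return matches[0].text.strip() if matches else default


def build_about_client(section):
    """Build the combined 'About the Client' string from the section element."""
    fields = {
        "Location": '[data-qa="client-location"]',
        "Job Posting Stats": '[data-qa="client-job-posting-stats"]',
        "Company Profile": '[data-qa="client-company-profile"]',
    }
    return "\n".join(
        f"{label}: {safe_text(section, selector)}" for label, selector in fields.items()
    )
```

Inside `extract_client_info`, the three separate lookups could then become `client_info['About the Client'] = build_about_client(about_client_section)`. If the error turns out to be a timeout on the `.cfe-about-client-v2` wait instead, this will not help, and the page structure of the failing links would need a closer look.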
