feat(peopledatalabs): add People Data Labs integration #4513

Merged
waleedlatif1 merged 13 commits into staging from waleedlatif1/pdl-integration
May 8, 2026

Conversation

@waleedlatif1
Collaborator

@waleedlatif1 waleedlatif1 commented May 8, 2026

Summary

  • Adds the People Data Labs integration to Sim with 11 operations: person enrich/identify/search/bulk, company enrich/search/bulk/clean, location and school cleaners, and field autocomplete.
  • API-key auth via X-Api-Key. New block uses AuthMode.ApiKey, brand color #4831C3, and the official PDL icon.
  • Every tool was cross-validated against PDL's official API docs: scroll_token pagination (not from), top-level likelihood on company enrich, per-item likelihood on bulk company, full autocomplete field enum (location_name over deprecated location), correct dataset values, cleaners as POST, and 404-as-no-match handling.
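The API-key auth and 404-as-no-match handling listed above can be sketched roughly as follows. This is an illustration, not the PR's code: `pdlHeaders`, `interpretPdlResponse`, and the `PdlResult` shape are hypothetical names, and only the `X-Api-Key` header and the `matched: false` outcome come from this description.

```typescript
// Hypothetical result shape; the real tools define their own types.
type PdlResult =
  | { matched: true; data: unknown }
  | { matched: false }

// API-key auth: the key travels in the X-Api-Key header.
function pdlHeaders(apiKey: string): Record<string, string> {
  return { 'X-Api-Key': apiKey, 'Content-Type': 'application/json' }
}

// Treat 404 as "no record matched" rather than as a failure.
// Any other non-2xx status is still surfaced as an error.
function interpretPdlResponse(status: number, body: unknown): PdlResult {
  if (status === 404) return { matched: false }
  if (status < 200 || status >= 300) {
    throw new Error(`PDL request failed with status ${status}`)
  }
  return { matched: true, data: body }
}
```

Keeping the status check in a pure function like this makes the no-match branch easy to exercise without network access.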

Test plan

  • Person Enrich by email returns matched record + likelihood
  • Person Identify returns up to 20 candidate matches with match_score
  • Person Search via SQL returns results + scroll_token for pagination
  • Bulk Person Enrich processes a JSON array and echoes per-item metadata
  • Company Enrich by website returns top-level likelihood
  • Company Search returns total + scroll_token
  • Bulk Company Enrich returns per-item likelihood
  • Company / Location / School Cleaner POSTs return canonical records (or matched: false on 404)
  • Autocomplete returns suggestions with meta payload for company field
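The search items in this test plan rely on `scroll_token` pagination. A minimal sketch of how a caller might chain pages, assuming a response with `data`, `total`, and an optional `scroll_token`; the `SearchBody` and `SearchPage` types, the `sql` body field, and the `nextSearchBody` helper are hypothetical names chosen for illustration:

```typescript
interface SearchBody {
  sql: string
  size: number
  scroll_token?: string
}

interface SearchPage {
  data: unknown[]
  total: number
  scroll_token?: string | null
}

// Build the request body for the following page, or return null when the
// previous response carried no scroll_token (assumed to mean "no more pages").
function nextSearchBody(prev: SearchBody, page: SearchPage): SearchBody | null {
  if (!page.scroll_token) return null
  return { sql: prev.sql, size: prev.size, scroll_token: page.scroll_token }
}
```

A caller would keep requesting pages until `nextSearchBody` returns `null`, passing the token from each response into the next request instead of an offset.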

Add 11 PDL operations: person enrich/identify/search/bulk, company
enrich/search/bulk/clean, location/school cleaners, and autocomplete.
All endpoints, params, and response shapes verified against official
PDL docs (scroll_token pagination, top-level likelihood on company
enrich, per-item likelihood on bulk company, full autocomplete field
enum).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@vercel

vercel Bot commented May 8, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| docs | Skipped | Skipped | May 8, 2026 5:53pm |

@cursor

cursor Bot commented May 8, 2026

PR Summary

Medium Risk
Adds a new third-party integration with 11 API-backed tools and new parameter-mapping logic, which could affect request correctness and error handling. Risk is moderated by being additive and scoped to a new peopledatalabs type.

Overview
Adds a new People Data Labs integration end-to-end: new block configuration (API-key auth, operation selector, and parameter normalization to prevent stale UI values leaking across operations) plus registration in the block registry.

Introduces 11 new PDL tools (person/company enrich, search, identify, bulk enrich, cleaners, autocomplete) with typed request/response shaping, consistent X-Api-Key auth, and 404-as-no-match handling, and wires them into the global tool registry.

Updates docs and landing/integrations metadata to list the new tool, adds a PeopleDataLabsIcon, and maps the peopledatalabs type to that icon across docs and the integrations page.

Reviewed by Cursor Bugbot for commit 80f1279. Configure here.

@greptile-apps
Contributor

greptile-apps Bot commented May 8, 2026

Greptile Summary

This PR adds a complete People Data Labs integration with 11 operations spanning person and company enrichment, search, bulk processing, cleaner utilities, and field autocomplete. The implementation is thorough: API-key auth flows correctly through user-only visibility, 404 responses are handled as no-match rather than errors, scroll_token pagination is used correctly, and the multi-operation params remapping function carefully resets shared fields before repopulating them per-operation to prevent stale-value leakage — all hardened across several prior review rounds.

  • 11 new tools registered in both the tool and block registries, each with typed params, transformResponse, and projected output schemas.
  • Multi-operation block with a careful alias-reset strategy in tools.config.params that prevents cross-operation field leakage; confirmed clean after prior fixes.
  • Minor gaps remaining: the name input description is ambiguous (person name vs company name), and the cleaner tools send an empty POST body without error when no identifier field is provided.

Confidence Score: 5/5

Safe to merge — all tool-to-block parameter mappings are correct, auth is properly scoped, and the cross-operation field-reset logic is solid after prior iterations.

The integration is a clean additive change with no modifications to existing tools. All prior review findings have been addressed. The two remaining items are a misleading description on a dual-purpose input and missing pre-flight validation in cleaner tools — neither causes silent wrong results or data loss.

No files require special attention. The company_clean.ts and school_clean.ts body builders are worth a second look for the missing empty-input guard, but this is non-blocking.
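The missing empty-input guard mentioned above could look something like the sketch below. It assumes the Company Cleaner accepts only `name`, `website`, and `profile` (as stated in a later commit message on this PR); `CompanyCleanInput` and `buildCompanyCleanBody` are hypothetical names, not the functions in `company_clean.ts`.

```typescript
interface CompanyCleanInput {
  name?: string
  website?: string
  profile?: string
}

// Copy only non-empty identifier fields into the POST body.
function buildCompanyCleanBody(input: CompanyCleanInput): Record<string, string> {
  const body: Record<string, string> = {}
  for (const key of ['name', 'website', 'profile'] as const) {
    const value = input[key]?.trim()
    if (value) body[key] = value
  }
  // Hypothetical pre-flight guard: fail fast instead of sending `{}`.
  if (Object.keys(body).length === 0) {
    throw new Error('Company Cleaner requires at least one of name, website, or profile')
  }
  return body
}
```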

Important Files Changed

| Filename | Overview |
| --- | --- |
| apps/sim/blocks/blocks/peopledatalabs.ts | New block defining 11 PDL operations; params remapping is thorough after prior fixes. |
| apps/sim/tools/peopledatalabs/utils.ts | buildQueryString and projection helpers look correct. |
| apps/sim/tools/peopledatalabs/person_enrich.ts | Adds name param to both PdlPersonEnrichParams and buildQueryString call, addressing prior review findings. |
| apps/sim/tools/peopledatalabs/person_identify.ts | Full identify params forwarded correctly in query string. |
| apps/sim/tools/peopledatalabs/company_enrich.ts | Reads top-level likelihood from response as documented; correctly handles 404 as no-match. |
| apps/sim/tools/peopledatalabs/bulk_person_enrich.ts | JSON-parses and validates the requests array; maps per-item likelihood and metadata correctly. |
| apps/sim/tools/peopledatalabs/bulk_company_enrich.ts | Same pattern as bulk person enrich; correctly reads per-item likelihood at item level. |
| apps/sim/tools/peopledatalabs/company_clean.ts | POST body builder correctly only sends provided fields; 404 handled as no-match. |
| apps/sim/tools/peopledatalabs/types.ts | Comprehensive type definitions for all PDL request/response shapes. |
| apps/sim/tools/peopledatalabs/person_search.ts | Uses scroll_token for pagination correctly. |
| apps/sim/tools/peopledatalabs/company_search.ts | Same pattern as person search; correctly returns scroll_token for pagination. |
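
The "JSON-parses and validates the requests array" step noted for the bulk tools might be approximated as below. This is a sketch under assumptions: `parseBulkRequests` is a hypothetical helper, and the exact validation rules in `bulk_person_enrich.ts` are not shown in this conversation.

```typescript
// Parse the user-supplied `requests` input and make sure it is a non-empty
// array of objects before it is sent to a bulk endpoint.
function parseBulkRequests(raw: string | unknown[]): Record<string, unknown>[] {
  let parsed: unknown
  if (typeof raw === 'string') {
    try {
      parsed = JSON.parse(raw)
    } catch (_err) {
      throw new Error('requests must be valid JSON')
    }
  } else {
    parsed = raw
  }
  if (!Array.isArray(parsed) || parsed.length === 0) {
    throw new Error('requests must be a non-empty JSON array')
  }
  for (const item of parsed) {
    if (typeof item !== 'object' || item === null || Array.isArray(item)) {
      throw new Error('each request must be a JSON object')
    }
  }
  return parsed as Record<string, unknown>[]
}
```

Accepting either a string or an already-parsed array lets the same helper serve UI input (text) and programmatic callers (arrays).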

Reviews (11): Last reviewed commit: "fix(peopledatalabs): isolate company `na..." | Re-trigger Greptile

Comment thread apps/sim/blocks/blocks/peopledatalabs.ts
@waleedlatif1
Collaborator Author

@cursor review

@waleedlatif1
Collaborator Author

@greptileai

@waleedlatif1
Collaborator Author

@greptile

@waleedlatif1
Collaborator Author

@cursor review

Comment thread apps/sim/blocks/blocks/peopledatalabs.ts
Comment thread apps/sim/blocks/blocks/peopledatalabs.ts
Comment thread apps/sim/blocks/blocks/peopledatalabs.ts
Comment thread apps/sim/blocks/blocks/peopledatalabs.ts
…peration

- min_likelihood now only shows for pdl_person_enrich (Person Identify ignores it)
- ticker, pdl_id, company_location now only show for pdl_company_enrich
  (Company Cleaner only accepts name/website/profile)

Addresses Greptile P1 review on PR #4513.
@waleedlatif1
Collaborator Author

@greptile

@waleedlatif1
Collaborator Author

@cursor review

Param renames (company_profile→profile, company_location→location,
school_*→*, bulk_*_requests→requests, autocomplete_size→size, etc.)
now run only when the matching operation is selected, and stale
alternate-operation values are stripped from the request. This
prevents values left over from a prior operation switch from leaking
into the current API call (e.g. a company LinkedIn URL overwriting
a person profile, or a stale search size overwriting autocomplete
size).

Addresses Cursor Bugbot review on PR #4513.
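
The operation-scoped renames described in this commit message can be pictured with the sketch below. It is not the block's actual `tools.config.params` code: `RENAMES` and `remapParams` are hypothetical, the rename table is truncated to two operations, and the `pdl_autocomplete` operation id is an assumption.

```typescript
type BlockParams = Record<string, unknown>

// UI-scoped subBlock id -> API param name, keyed by operation.
// The pdl_autocomplete id is assumed for illustration.
const RENAMES: Record<string, Record<string, string>> = {
  pdl_company_enrich: { company_profile: 'profile', company_location: 'location' },
  pdl_autocomplete: { autocomplete_size: 'size' },
}

function remapParams(operation: string, params: BlockParams): BlockParams {
  const out: BlockParams = { ...params }
  // Strip every UI-scoped alias first, so values left over from inactive
  // operations never reach the request.
  for (const op of Object.keys(RENAMES)) {
    const renames = RENAMES[op] ?? {}
    for (const uiKey of Object.keys(renames)) delete out[uiKey]
  }
  // Apply renames only for the operation that is currently selected.
  const active = RENAMES[operation] ?? {}
  for (const uiKey of Object.keys(active)) {
    const apiKey = active[uiKey]
    const value = params[uiKey]
    if (apiKey !== undefined && value !== undefined && value !== '') out[apiKey] = value
  }
  return out
}
```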
@waleedlatif1
Collaborator Author

@greptile

@waleedlatif1
Collaborator Author

@cursor review

Comment thread apps/sim/components/icons.tsx Outdated
The PeopleDataLabsIcon was hardcoded to white, leaving it invisible
on light backgrounds when rendered outside its bgColor container
(e.g., search results, menus, docs). Switch to currentColor so it
inherits the surrounding text color.

Addresses Cursor Bugbot review on PR #4513.
@waleedlatif1
Collaborator Author

@greptile

@waleedlatif1
Collaborator Author

@cursor review

Comment thread apps/sim/blocks/blocks/peopledatalabs.ts Outdated

@cursor cursor Bot left a comment

✅ Bugbot reviewed your changes and found no new issues!

Comment @cursor review or bugbot run to trigger another review on this PR

Reviewed by Cursor Bugbot for commit f2a3c9f. Configure here.

…te) per operation

The block has subBlocks whose raw IDs collide with PDL API param names
(profile, location for person; name, website for company). Their values
persist across operation switches even though the UI hides them, so a
person LinkedIn URL could leak into a Company Enrich request, etc.
Reset these shared targets and repopulate them only from inputs that
belong to the active operation.

Addresses Greptile P1 review on PR #4513.
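
The reset-then-repopulate step for colliding raw IDs might be sketched as follows. The ownership table reflects only the pairing named in this commit message (`profile`/`location` for person, `name`/`website` for company); `SHARED_TARGETS`, `OWNED`, and `resetSharedTargets` are hypothetical names, and later commits on this PR changed how `name` is handled.

```typescript
type RawParams = Record<string, unknown>

const SHARED_TARGETS = ['profile', 'location', 'name', 'website'] as const
type SharedTarget = (typeof SHARED_TARGETS)[number]

// Which shared fields each operation owns under its raw id.
// This table is an assumption made for the sketch.
const OWNED: Record<string, readonly SharedTarget[]> = {
  pdl_person_enrich: ['profile', 'location'],
  pdl_company_enrich: ['name', 'website'],
}

function resetSharedTargets(operation: string, params: RawParams): RawParams {
  const out: RawParams = { ...params }
  // Clear every shared target, including values the UI is currently hiding.
  for (const key of SHARED_TARGETS) delete out[key]
  // Restore only the fields that belong to the active operation.
  for (const key of OWNED[operation] ?? []) {
    if (params[key] !== undefined) out[key] = params[key]
  }
  return out
}
```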
@waleedlatif1
Collaborator Author

@greptile

Comment thread apps/sim/blocks/blocks/peopledatalabs.ts Outdated
Comment thread apps/docs/components/icons.tsx Outdated
- person_identify: short-circuit on PDL 404 (no-match), matching
  the person_enrich pattern
- company_search: drop unsupported `dataset` param (PDL company
  search docs do not list it)
- block: expose `min_likelihood` for `pdl_company_enrich` (PDL
  Company Enrichment supports min_likelihood)
- location_clean: surface `subregion`; drop phantom `latitude`/
  `longitude` (PDL only returns `geo` as a "lat,lon" string)
- school_clean: surface `domain` and `location_continent` from
  the nested `location` object
- docs icon: switch fill to `currentColor` so the icon renders
  on light backgrounds
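
Since `geo` is described here as a single "lat,lon" string, a downstream consumer that needs numeric coordinates could split it itself. A minimal sketch, assuming plain decimal values; `parseGeo` is a hypothetical helper and not part of this PR:

```typescript
// Parse a "lat,lon" string into numbers; return null for anything else.
function parseGeo(geo: string | null | undefined): { lat: number; lon: number } | null {
  if (!geo) return null
  const match = /^\s*(-?\d+(?:\.\d+)?)\s*,\s*(-?\d+(?:\.\d+)?)\s*$/.exec(geo)
  if (!match) return null
  return { lat: Number(match[1]), lon: Number(match[2]) }
}
```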
@waleedlatif1
Collaborator Author

@greptile

@waleedlatif1
Collaborator Author

@cursor review

Comment thread apps/sim/blocks/blocks/peopledatalabs.ts
The shared `name` reset at the top of `tools.config.params` was
only repopulated for the company-side operations, so any
programmatic `name` input to `pdl_person_enrich` or
`pdl_person_identify` was silently dropped. Both PDL endpoints
accept `name` as a full-name match parameter.

@cursor cursor Bot left a comment

✅ Bugbot reviewed your changes and found no new issues!

Comment @cursor review or bugbot run to trigger another review on this PR

Reviewed by Cursor Bugbot for commit 092dfba. Configure here.

@waleedlatif1
Collaborator Author

@greptile

@waleedlatif1
Collaborator Author

@cursor review

Comment thread apps/sim/tools/peopledatalabs/person_enrich.ts
Comment thread apps/sim/tools/peopledatalabs/person_enrich.ts
Comment thread apps/sim/tools/peopledatalabs/types.ts
Comment thread apps/sim/tools/peopledatalabs/person_enrich.ts
Add the `name` parameter to `PdlPersonEnrichParams`, the tool's
params definition, and the URL builder. PDL Person Enrichment
accepts `name` as a full-name match alternative to first_name +
last_name; without it, programmatic `name` input was silently
dropped before reaching the API.
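
To illustrate why a param has to be threaded through the URL builder to reach the API, here is a stand-in for a query-string helper. It is not the `buildQueryString` in `utils.ts` (whose implementation is not shown here); `toQueryString` is a hypothetical equivalent that omits unset values.

```typescript
// Serialize only the params that were actually provided, so optional inputs
// such as `name` reach the API when set and are left out otherwise.
function toQueryString(
  params: Record<string, string | number | boolean | undefined | null>
): string {
  const pairs: string[] = []
  for (const key of Object.keys(params)) {
    const value = params[key]
    if (value === undefined || value === null || value === '') continue
    pairs.push(`${encodeURIComponent(key)}=${encodeURIComponent(String(value))}`)
  }
  return pairs.join('&')
}
```

With a helper of this shape, a field that is never passed in (as `name` was before this commit) simply never appears in the URL, which is why the omission was silent.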
@waleedlatif1
Collaborator Author

@greptile

@waleedlatif1
Collaborator Author

@cursor review

Comment thread apps/sim/blocks/blocks/peopledatalabs.ts
Rename Company Name subBlock id from `name` to `company_name` so a
stale company value can't leak into Person Enrich/Identify when the
user switches operations.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@waleedlatif1
Collaborator Author

@greptile

@waleedlatif1
Collaborator Author

@cursor review


@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Reviewed by Cursor Bugbot for commit 80f1279. Configure here.

Comment thread apps/sim/blocks/blocks/peopledatalabs.ts
`pdl_clean_location` and `pdl_clean_school` were only restoring values
from UI subBlock IDs (`clean_location_input`, `school_*`). Programmatic
callers using the declared `location`/`name`/`website`/`profile` inputs
had their values dropped after the shared-field reset. Add fallbacks so
both UI and programmatic inputs flow through.
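
The fallback described here amounts to preferring the UI-scoped value and otherwise using the declared programmatic input. A small sketch, with `pickInput` as a hypothetical helper; it would need to read from the original params, before the shared-field reset removes them:

```typescript
// Prefer the UI-scoped value; fall back to the programmatic input when the
// UI field is absent or blank.
function pickInput(
  params: Record<string, unknown>,
  uiKey: string,
  apiKey: string
): unknown {
  const uiValue = params[uiKey]
  if (uiValue !== undefined && uiValue !== null && uiValue !== '') return uiValue
  return params[apiKey]
}
```

For the location cleaner this would be called as `pickInput(params, 'clean_location_input', 'location')`.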

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…lete text

- Company Enrich and Clean Company now fall back to programmatic
  `params.profile` / `params.location` when the UI-scoped
  `company_profile` / `company_location` are absent. Mirrors the
  fallback pattern already used for `name`.
- Autocomplete `text` subBlock is now required when operation is
  autocomplete — PDL requires it for nearly all field values.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@waleedlatif1 waleedlatif1 merged commit 9e9ddaa into staging May 8, 2026
13 checks passed
@waleedlatif1 waleedlatif1 deleted the waleedlatif1/pdl-integration branch May 8, 2026 18:01