Introduction
There has been an incredible amount of activity around the evolution of LLM-based agents. Most recently, Anthropic's Model Context Protocol (MCP) has seen adoption as the standard for how application context is provided to agents. My guess is we'll soon see improvements in transport layer options, along with authentication and authorization that provide a more granular permissions model. There are already a good handful of MCP marketplaces/registries centralizing and easing access to the thousands of servers that have already been developed. MCP gateways and agent discovery mechanisms will be next, along with 'supply chain' attacks against that infrastructure. How do you verify and trust an agent or MCP server that wasn't developed in-house?
All this got me thinking about how agents could discover, share and communicate information, publish capabilities and autonomously collaborate on tasks, even when they're from different agent providers. All in a standardized, easily implementable manner.
As LLM-based agents become more specialized and numerous, we need a common approach for these digital workers to find one another and collaborate.
Introducing the Agent Communication and Discovery Protocol (ACDP)
ACDP is a practical approach that leverages familiar technologies to create a secure, robust network of discoverable, interoperable AI agents.
Coincidentally, Google just released A2A, an approach to giving agents a common language for communication. They lightly touch on discovery and registration of capabilities too:
“...critical to support multi-agent communication by giving your agents a common language – irrespective of the framework or vendor they are built on...”
Regardless of industry or domain, this communication requirement will become commonplace. I think we'll see a fast evolution and merging of the capabilities provided by the services and companies solving this problem today.
Our initial focus with ACDP was on agent discovery and establishing peer relationships for the sharing of capabilities, all through the use of existing technologies.
The full write-up (it comes with more diagrams!) of the protocol can be found here. This was a thought exercise more than anything else, but thinking through the mechanics of something like this helps with reasoning about solutions in other areas of our platform. There is a simple PoC implementation of the concepts. It's somewhat contrived, as it forces discovery and collaboration with other agents whenever a question is asked of an agent, but the idea was to show that the approach is viable.
Why We Need a Discovery Protocol for AI Agents
Let's consider a simple scenario: Imagine you have a specialized accounting agent that needs help with language translation. How does it find a translation agent? How do they exchange information securely? (I have no idea if accountants need a translation agent but looking at my tax returns, I might.) Hey, the example could be worse, it could be yet another weather agent/tool.
What will likely happen in the agentic world today (even assuming MCP is universally adopted) is that the accounting agent will do what it’s capable of and deliver output that needs to be translated separately. Or even worse, the agent will need to work on the output of translated content and restart the process again, requiring human input and guidance along the process.
Without a standard protocol, every agent integration becomes a custom project. This specification defines an approach that allows LLM-based agents to advertise themselves via DNS and discover peers in a hybrid decentralized manner. It leverages standard DNS records (TXT, SRV) for discovery and metadata, augmented by a central registry for detailed capability listing, auth requirements and dynamic updates and search. All agent-to-agent and agent-to-registry communication uses HTTPS for security and interoperability. The protocol defines how agents register their endpoints, discover each other (both through DNS and peer-to-peer awareness), describe their capabilities in a structured way, and establish secure communications. ACDP enables them to:
- Advertise their capabilities (like translation, summarization, coding, incident response, chess)
- Discover other agents with complementary skills
- Communicate securely
- Collaborate on complex tasks
The best part? There is no need to reinvent the wheel – the protocol builds on technologies that have served the internet reliably (with a few bumps along the way) for decades.
- DNS Service Discovery: DNS-based discovery mechanism leveraging DNS SRV and TXT records to publish agent endpoints and capabilities.
- Central Registry + Peer Awareness: Like how we combine Google searches with recommendations from friends, ACDP uses both a central registry (for searchability) and peer-to-peer gossip (for resilience). LLM-based agents register themselves, discover peers, and collaborate on tasks.
- HTTPS for Everything: All communication happens over standard HTTPS, ensuring compatibility with existing infrastructure and security tools.
Let's take a look at how these pieces could fit together in practice:
Agent Registration Process
When a new agent comes online, it performs two key registration steps:
- DNS Registration: The agent registers SRV and TXT records in DNS, which advertise:
- The agent's hostname and port (SRV record)
- The agent's capabilities and description (TXT record)
- Registry Registration: The agent sends its full metadata to the central registry including:
- Basic information (ID, name, description)
- Detailed capabilities list
- Interface details (REST endpoints)
- Model information
- Endpoints for specific features
After registration, the agent maintains a heartbeat connection with the registry, sending periodic updates to confirm it's still active. If the registry loses track of the agent, the agent will automatically re-register during its next heartbeat attempt.
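The heartbeat and re-registration behaviour described above can be sketched roughly as follows. This is a minimal sketch, not the PoC's code: `AgentMetadata`, `InMemoryRegistry` and `send_heartbeat` are hypothetical names, the metadata fields simply mirror the bullets above, and the in-memory class stands in for what would really be HTTPS calls to the registry.

```python
from dataclasses import dataclass, asdict
from typing import List
import time


@dataclass
class AgentMetadata:
    # Field names are assumptions that mirror the bullets above, not an ACDP schema.
    id: str
    name: str
    description: str
    capabilities: List[str]
    endpoint: str
    model: str = "unknown"


class InMemoryRegistry:
    """Hypothetical stand-in for the central registry's HTTPS API."""

    def __init__(self):
        self.agents = {}      # agent id -> metadata dict
        self.last_seen = {}   # agent id -> time of the last heartbeat

    def register(self, meta: AgentMetadata) -> None:
        self.agents[meta.id] = asdict(meta)
        self.last_seen[meta.id] = time.time()

    def heartbeat(self, agent_id: str) -> bool:
        """Return False when the registry no longer knows the agent."""
        if agent_id not in self.agents:
            return False
        self.last_seen[agent_id] = time.time()
        return True


def send_heartbeat(registry: InMemoryRegistry, meta: AgentMetadata) -> str:
    """Send one heartbeat; re-register if the registry lost track of the agent."""
    if registry.heartbeat(meta.id):
        return "alive"
    registry.register(meta)
    return "re-registered"
```

A real agent would call `send_heartbeat` on a timer; the interval is a deployment choice the text does not specify.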

Agent Discovery Process
The Discovery Service component provides a unified way to discover other agents through multiple methods:
- Registry-based Discovery: Agents can query the central registry to:
- Find a specific agent by ID
- Discover agents with specific capabilities
- Search for agents matching multiple criteria
- DNS-based Discovery: Agents can resolve other agents using DNS queries:
- SRV lookup to find host and port
- TXT lookup to get capabilities and description
- Cache-based Optimization: The Discovery Service maintains a local cache of discovered agents to reduce network traffic and improve performance.
The implementation follows a fallback pattern: first check the cache, then try the registry, then fall back to DNS, and finally ask peers. The simplified example here omits the cache step:
def find_agent():
    # Try registry first
    agents = registry_client.search(capability="translation")
    # If none found, try DNS lookup
    if not agents:
        srv_records = dns_resolver.query("_llm-agent._tcp.translator.agents.example.com", "SRV")
        txt_records = dns_resolver.query("_llm-agent._tcp.translator.agents.example.com", "TXT")
        # Parse records...
    # If still nothing, ask peers
    if not agents:
        agents = ask_peers_for_agents()
    return agents
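The cache step mentioned above is not shown in that snippet, so here is one way it might look. This is a sketch under assumptions: `DiscoveryCache` and `discover` are hypothetical names, the five-minute TTL is arbitrary, and each discovery source is modelled as a plain callable that takes a capability and returns a list of agents.

```python
import time


class DiscoveryCache:
    """Tiny TTL cache keyed by capability (hypothetical helper)."""

    def __init__(self, ttl_seconds=300.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries = {}  # capability -> (expiry time, agents)

    def get(self, capability):
        entry = self._entries.get(capability)
        if entry is None or entry[0] <= self.clock():
            self._entries.pop(capability, None)  # drop expired entries
            return None
        return entry[1]

    def put(self, capability, agents):
        self._entries[capability] = (self.clock() + self.ttl, agents)


def discover(capability, cache, sources):
    """Check the cache, then each source in order; cache the first hit."""
    cached = cache.get(capability)
    if cached:
        return cached
    for lookup in sources:
        try:
            agents = lookup(capability)
        except Exception:
            agents = []  # a failing source should not block the next one
        if agents:
            cache.put(capability, agents)
            return agents
    return []
```

Passing the sources as an ordered list (registry, DNS, peers) keeps the fallback order explicit and easy to change.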
When our accountant's AI agent needs to find an agent with the appropriate capabilities, it can:
- Query the registry: "Show me agents with translation capability"
- Directly look up DNS:
_llm-agent._tcp.translator.agents.example.com
- Ask a peer agent: "Do you know any good translation agents?"
This hybrid approach means that if one discovery method fails, the others can pick up the slack. Because discovery, communication and collaboration let agents share their abilities, every agent in the network becomes more capable. So the accounting agent will find the translation support it needs, and it may collaborate with other agents to help them finish their work efficiently.
A discovery request would look like this:

Peer-to-Peer Awareness
The Peer Manager component implements a decentralized peer discovery mechanism through "gossiping":
- Gossip Protocol: At regular intervals, each agent:
- Selects a random subset of known peers
- Exchanges peer lists with them
- Discovers new peers through these exchanges
- Health Checking: Agents maintain information about peer health:
- They periodically check if peers are responsive
- They update peer status (healthy, unhealthy, unknown)
- They remove stale peers that haven't been seen for too long
- Peer Exchange: When two agents gossip, they:
- Send a list of their known peers to each other
- Process any new peers received
- Resolve details of new peers via the Discovery Service
This peer-to-peer approach ensures agents can discover new peers even if the central registry is unavailable, creating a resilient network.
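To make the gossip mechanics above a little more concrete, here is a rough sketch of one exchange between two agents. `PeerManager` reuses the component name from the text, but its methods, the fanout of 3 and the ten-minute staleness window are all assumptions of mine, and the exchange happens in memory rather than over HTTPS.

```python
import random
import time


class PeerManager:
    """Sketch of one agent's peer table; method names are hypothetical."""

    def __init__(self, agent_id, fanout=3, stale_after=600.0):
        self.agent_id = agent_id
        self.fanout = fanout            # peers contacted per gossip round
        self.stale_after = stale_after  # seconds before a silent peer is dropped
        self.peers = {}                 # peer id -> time we last heard from it

    def pick_targets(self, rng=random):
        """Select a random subset of known peers to gossip with."""
        known = sorted(self.peers)
        return rng.sample(known, min(self.fanout, len(known)))

    def merge(self, peer_ids, now):
        """Add previously unknown peers; return the ids that were new."""
        new = []
        for peer_id in peer_ids:
            if peer_id != self.agent_id and peer_id not in self.peers:
                self.peers[peer_id] = now
                new.append(peer_id)
        return new

    def prune(self, now):
        """Remove peers that have not been seen for too long."""
        stale = [p for p, seen in self.peers.items() if now - seen > self.stale_after]
        for peer_id in stale:
            del self.peers[peer_id]
        return stale


def gossip(a, b, now=None):
    """Exchange peer lists between two agents and record the direct contact."""
    now = time.time() if now is None else now
    a_view = [a.agent_id, *a.peers]  # snapshot before either side is updated
    b_view = [b.agent_id, *b.peers]
    new_for_a = a.merge(b_view, now)
    new_for_b = b.merge(a_view, now)
    a.peers[b.agent_id] = now  # direct contact counts as a fresh sighting
    b.peers[a.agent_id] = now
    return new_for_a, new_for_b
```

Note that second-hand mentions only add unknown peers and do not refresh timestamps; otherwise a dead peer could be kept alive indefinitely by rumours.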
Agent Collaboration
The implementation includes a collaboration mechanism where agents can work together to answer questions:
- Assistance Requests: When an agent receives a question, it can:
- Select relevant peer agents based on capabilities
- Send assistance requests to those peers
- Gather responses from peers in parallel
- Collaborative Responses: The primary agent:
- Incorporates peer responses into its own context
- Generates a final response that synthesizes the collective knowledge
- Credits peer agents in the response
- Agent Selection: Agents prioritize peers for collaboration based on:
- Relevance of capabilities to the query
- Health status (preferring responsive peers)
- Recent activity
This collaborative approach allows the system to leverage specialized knowledge across different agents.
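A rough sketch of the selection and parallel-gather steps might look like this. The scoring weights are arbitrary, peers are plain dicts, `ask` is a hypothetical callable standing in for the HTTPS assistance request, and the "recent activity" criterion is left out to keep the example short.

```python
from concurrent.futures import ThreadPoolExecutor

# Arbitrary weights for this sketch; the text only says responsive peers are preferred.
HEALTH_BONUS = {"healthy": 1.0, "unknown": 0.5, "unhealthy": 0.0}


def score_peer(peer, needed_caps):
    """Rank a peer by capability overlap, with a bonus for good health."""
    overlap = len(set(peer["capabilities"]) & set(needed_caps))
    if overlap == 0:
        return 0.0  # irrelevant peers are never selected
    return overlap + HEALTH_BONUS.get(peer["status"], 0.0)


def select_peers(peers, needed_caps, limit=3):
    """Return up to `limit` relevant peers, best score first."""
    relevant = [p for p in peers if score_peer(p, needed_caps) > 0]
    relevant.sort(key=lambda p: score_peer(p, needed_caps), reverse=True)
    return relevant[:limit]


def gather_assistance(question, peers, ask):
    """Call ask(peer, question) for every peer in parallel; skip failures."""
    if not peers:
        return []

    def safe_ask(peer):
        try:
            return {"peer": peer["id"], "answer": ask(peer, question)}
        except Exception:
            return None  # an unresponsive peer should not sink the whole request

    with ThreadPoolExecutor(max_workers=len(peers)) as pool:
        results = list(pool.map(safe_ask, peers))
    return [r for r in results if r is not None]
```

The primary agent would then fold the returned answers into its own context and credit each `peer` id in the final response.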
Since DNS is somewhat key to the discovery mechanism, we should look at some actual DNS records for an agent:
; SRV record defines where to find the agent
_llm-agent._tcp.translator.agents.example.com. IN SRV 10 10 8000 translator-1.example.com.
; TXT record describes the agent's capabilities
_llm-agent._tcp.translator.agents.example.com. IN TXT "ver=1.0" "caps=translation,summarization" "desc=Polyglot translator specialized in technical content for accountants"
; A record resolves the hostname to an IP
translator-1.example.com. IN A 192.168.1.100
These records tell us:
- The agent runs on host translator-1.example.com at port 8000
- It has translation and summarization capabilities
- It specializes in technical content (for accountants)
This structure can be queried with standard DNS tools. For example, to find the translator's address:
$ dig _llm-agent._tcp.translator.agents.example.com SRV
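Once the records come back, an agent still has to turn them into something usable. Here is a small sketch of that parsing step, assuming the `key=value` TXT layout shown above; `parse_agent_txt` and `parse_srv` are hypothetical helper names, and the input is the record data as plain strings rather than a resolver's response objects.

```python
def parse_agent_txt(strings):
    """Turn TXT character-strings like 'caps=a,b' into a metadata dict."""
    meta = {}
    for item in strings:
        key, sep, value = item.partition("=")
        if not sep:
            continue  # ignore strings that are not key=value
        meta[key.strip().lower()] = value.strip()
    # Capabilities are a comma-separated list in the example records.
    meta["caps"] = [c for c in meta.get("caps", "").split(",") if c]
    return meta


def parse_srv(rdata):
    """Split SRV record data 'priority weight port target' into named fields."""
    priority, weight, port, target = rdata.split()
    return {
        "priority": int(priority),
        "weight": int(weight),
        "port": int(port),
        "host": target.rstrip("."),  # drop the trailing root dot
    }
```

For the records above, `parse_srv("10 10 8000 translator-1.example.com.")` yields host `translator-1.example.com` and port `8000`.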
Security Considerations
1. DNS Security: DNSSEC provides cryptographic verification of DNS records, preventing spoofing.
2. Authentication Layers:
- Registry updates require authentication (API keys or digital signatures)
- Agent-to-agent communication can use mutual TLS for bidirectional verification
- Access controls determine which agents can see or interact with others
- OAuth2 should be easily implementable
3. Tiered Trust Model:
- DNS provides baseline trust (domain ownership verification)
- TLS certificates verify identity
- Peer reputation tracks trustworthiness over time
4. Private Deployments:
- Private Registries: Organizations can run their own registry servers internally.
- Split DNS: Different DNS responses can be provided for internal vs. external queries.
- Controlled Discovery: Access to registry and DNS can be restricted to authorized clients.
- Security Measures: Authentication, authorization, and encryption can be applied to all communications.
These features make the system suitable for enterprise, government, healthcare, and financial deployments where privacy and security are paramount.
For example, if Agent A queries Agent B for help, it first verifies Agent B's identity through TLS, checks if Agent B has the necessary capabilities, and may require additional authentication before sharing sensitive data.
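That sequence of checks could be sketched along these lines using Python's standard `ssl` module. The function names and the shape of the `peer` dict are assumptions for illustration; in particular, `tls_verified` and `extra_auth` are hypothetical flags that a real agent would set after the handshake and any additional authentication step.

```python
import ssl


def build_client_context(ca_file=None, cert_file=None, key_file=None):
    """TLS context that verifies the peer and optionally presents our own cert."""
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_file)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    if cert_file:
        # Presenting a client certificate is what makes the TLS session mutual.
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return ctx


def may_share(peer, needed_capability, sensitive):
    """Apply the checks from the example in order; `peer` is a plain dict here."""
    if not peer.get("tls_verified"):
        return False  # identity was not established over TLS
    if needed_capability not in peer.get("capabilities", []):
        return False  # peer cannot do what we need
    if sensitive and not peer.get("extra_auth"):
        return False  # sensitive data requires additional authentication
    return True
```

The default context already requires a valid certificate and checks the hostname, so the sketch only tightens the minimum protocol version.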
Private Deployments: Keep Your Agents to Yourself
For enterprises, healthcare providers, and government institutions, ACDP offers additional privacy controls:
Private Registry: Run your own internal registry that only indexes authorized agents:
AgentRegistry.internal.company.com
├── hr-assistant (capability: hr-policies)
├── code-reviewer (capability: code-review)
└── security-scanner (capability: vulnerability-detection)
Split Horizon DNS: Different DNS responses for internal vs. external queries:
; Internal DNS sees this
_llm-agent._tcp.hr-bot.company.internal. IN SRV 0 0 8000 hr-bot.company.internal.
; External DNS query returns nothing or an error
Network Isolation: Keep agents on a private network, accessible only via VPN or internal access points.
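The split-horizon idea boils down to choosing an answer based on where the query comes from. In practice this is DNS server configuration rather than application code, but a toy sketch shows the logic; the network ranges and the `answer` function are assumptions, and only the SRV record comes from the example above.

```python
import ipaddress

# Assumed internal address ranges; a real deployment would use its own.
INTERNAL_NETS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("192.168.0.0/16"),
]

# Records that only the internal view may see.
INTERNAL_ZONE = {
    ("_llm-agent._tcp.hr-bot.company.internal.", "SRV"): "0 0 8000 hr-bot.company.internal.",
}


def answer(name, rtype, client_ip):
    """Return record data for internal clients, nothing for everyone else."""
    addr = ipaddress.ip_address(client_ip)
    if any(addr in net for net in INTERNAL_NETS):
        return INTERNAL_ZONE.get((name, rtype))
    return None  # external view: behave as if the name does not exist
```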
Enterprise AI Ecosystem: A large corporation deploys several specialized agents on their internal network:
- An HR Virtual Assistant resides on the corporate intranet with the DNS entry _llm-agent._tcp.hr-bot.company.internal. Employees can discover it through the internal agent directory, but external parties have no visibility into this agent. It handles onboarding questions, benefits inquiries, and policy clarifications, while keeping sensitive HR data within company boundaries.
- An IT Support Agent registered as it-assist.corp.internal helps staff troubleshoot technical issues and automate support tickets. The split DNS configuration ensures only devices on the corporate network (or VPN) can find the agent's address, speeding up internal support while preventing outsiders from accessing or even knowing about the support bot.
Intelligence agencies could implement ACDP within classified networks for secure agent collaboration:
; Example private DNS zone in a classified network
_llm-agent._tcp.intel-analyst.siprnet.gov. IN SRV 0 0 443 analyst-agent.classified.gov.
_llm-agent._tcp.intel-analyst.siprnet.gov. IN TXT "caps=intel-analysis,pattern-recognition" "desc=Classified intelligence analysis assistant"
- Intelligence Processing Agents that analyze satellite imagery, intelligence reports, and signal intercepts likely operate exclusively on networks like SIPRNet. For example, a Defense Intelligence Agency analytical AI might be registered in a private directory so that analysts' workstations on the classified network can discover the "Intel Analyst AI" service, while the agent remains invisible outside that secure network.
Public Agent Ecosystems: Agents for Everyone
For public agent ecosystems, ACDP enables open discovery while maintaining security:
Public Registry: A searchable directory of available agents and their capabilities, similar to an app store.
Verified Domains: DNS-based verification ensures agents come from legitimate sources.
Capability Matching: Find the right agent for any task based on its advertised capabilities:
$ curl "https://registry.example.com/agents?capability=translation"
{
  "agents": [
    {
      "id": "translator.agents.example.com",
      "name": "Polyglot",
      "capabilities": ["translation", "summarization"],
      "description": "Specialized in technical content for..."
    }
  ]
}
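From inside an agent, the same query would be built and filtered in code. A minimal sketch, assuming the `/agents` path and the response shape from the curl example; `build_search_url` and `match_agents` are hypothetical helpers, and the HTTP call itself is left out so the example has no network dependency.

```python
import json
from urllib.parse import urlencode


def build_search_url(base, **criteria):
    """Build a registry search URL such as .../agents?capability=translation."""
    return f"{base}/agents?{urlencode(sorted(criteria.items()))}"


def match_agents(response_body, required):
    """Keep only agents that advertise every required capability."""
    agents = json.loads(response_body)["agents"]
    return [a for a in agents if set(required) <= set(a["capabilities"])]
```

Filtering again on the client side is a cheap safeguard in case a registry matches capabilities more loosely than the caller expects.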
Imagine a researcher working on climate modeling who needs specialized analysis:
- Their primary research assistant agent discovers three specialized agents through the registry:
- climate-data.science-agents.org (data analysis)
- viz-specialist.science-agents.org (visualization)
- paper-formatter.science-agents.org (academic formatting)
- The agents collaborate through a chain of specialized tasks
- Each agent registers its expertise publicly while maintaining security through authentication:
_llm-agent._tcp.climate-data.science-agents.org. IN TXT "caps=data-analysis,time-series,climate" "auth=required"
Extending Capabilities with Tool Discovery
ACDP can do more than connect agents – it also helps them discover tools and data sources through integration with the Model Context Protocol (MCP).
Once an agent has discovered an MCP server, it can connect to it, list the available tools, and access them through a standardized interface.
This integration allows agents to:
- Discover external tools
- Access structured data sources
- Find reusable prompt templates
For example, a hospital implements ACDP with MCP to safely integrate AI agents with electronic health record systems:
- The hospital sets up an MCP server that provides access to patient data as read-only resources:
_mcp._tcp.health-tools.hospital.local. IN SRV 0 0 443 ehr-gateway.hospital.local.
_mcp._tcp.health-tools.hospital.local. IN TXT "resources=labs,imaging,notes" "auth=oauth2"
- A doctor's AI assistant discovers this MCP server through DNS and the hospital's private registry.
- When the doctor asks: "Summarize this patient's recent lab results," the assistant:
- Authenticates with the MCP server (enforcing HIPAA compliance)
- Uses resources/read to retrieve authorized lab data
- Generates the summary using only data the doctor is authorized to access
- The entire interaction happens within the hospital's secure environment, with the MCP server enforcing role-based access controls
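The role-based check on the server side might look something like this. It is only a sketch: the role table, the `ehr://` URI scheme and `authorize_read` are invented for illustration, while the `resources/read` method with a `uri` parameter follows the MCP request shape.

```python
# Hypothetical role table; the resource names come from the TXT record above.
ROLE_RESOURCES = {
    "physician": {"labs", "imaging", "notes"},
    "billing": set(),
}


def authorize_read(role, request):
    """Server-side check: may this role read the resource named in the request?"""
    if request.get("method") != "resources/read":
        return False
    uri = request["params"]["uri"]       # e.g. ehr://labs/patient-123 (made-up scheme)
    scheme, _, rest = uri.partition("://")
    resource = rest.split("/", 1)[0]     # first path segment names the resource type
    return scheme == "ehr" and resource in ROLE_RESOURCES.get(role, set())
```

Keeping the decision on the MCP server means the assistant never has to be trusted to police its own access.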
Sample Security Operations Use Case: Integrated Threat Detection and Response
Consider a financial institution's Security Operations Center (SOC):
Private Security Infrastructure: The bank maintains several specialized security agents behind their firewall:
; Internal DNS records for security agents
_llm-agent._tcp.responder.bank-sec.internal. IN SRV 0 0 443 ir.bank-sec.internal.
_llm-agent._tcp.responder.bank-sec.internal. IN TXT "caps=log-analysis,threat-hunting,incident-response" "desc=Advanced threat detection and investigation agent"
_llm-agent._tcp.forensics.bank-sec.internal. IN SRV 0 0 443 forensics.bank-sec.internal.
_llm-agent._tcp.forensics.bank-sec.internal. IN TXT "caps=memory-forensics,disk-forensics,network-forensics" "desc=Digital forensics assistant"
Bridging to Public Intelligence: When investigating a suspicious alert, these private agents leverage public threat intelligence through authenticated ACDP connections:
; Public DNS records for threat intelligence agents
_llm-agent._tcp.threat-intel.security-alliance.org. IN SRV 0 0 443 osint.security-alliance.org.
_llm-agent._tcp.threat-intel.security-alliance.org. IN TXT "caps=ioc-lookup,threat-actor-profiles,vulnerability-data" "auth=required"
Here's how this hybrid approach might handle a potential security incident:
- A notification is generated by their Identity Management platform and alerts the Incident Response Agent
- The agent:
- Analyzes internal logs
- Discovers/Extracts indicators (IP addresses, UA Strings, UPNs, Emails, Domains, etc.)
- Needs additional context on these indicators
- Using ACDP, it discovers and authenticates to public threat intelligence agents
- Based on the combined internal and external intelligence, it may:
- Collaborate with the Forensics Agent for deeper investigation
- Generate automated response recommendations
- Create a detailed report for the security team
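The indicator-extraction step in that flow is the easiest piece to sketch. The version here only handles IPv4 addresses and email addresses, with deliberately simple regular expressions; `extract_indicators` is a hypothetical name, and a real agent would cover far more indicator types.

```python
import ipaddress
import re

IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")


def extract_indicators(log_lines):
    """Collect unique IPv4 addresses and email addresses from log lines."""
    ips, emails = set(), set()
    for line in log_lines:
        for candidate in IPV4_RE.findall(line):
            try:
                ipaddress.IPv4Address(candidate)  # rejects values like 999.1.1.1
            except ValueError:
                continue
            ips.add(candidate)
        emails.update(match.lower() for match in EMAIL_RE.findall(line))
    return {"ips": sorted(ips), "emails": sorted(emails)}
```

The deduplicated result is what the agent would hand to a threat intelligence agent for additional context.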
The system also leverages MCP for discovering and using enterprise security tools, including integration with common security platforms:
; MCP server for security operations tools
_mcp._tcp.security-tools.bank-sec.internal. IN SRV 0 0 443 security-ops.bank-sec.internal.
_mcp._tcp.security-tools.bank-sec.internal. IN TXT "tools=firewall-control,endpoint-isolation,malware-scan" "auth=mTLS"
; MCP server for Microsoft security tools integration
_mcp._tcp.ms-security.bank-sec.internal. IN SRV 0 0 443 security-tools.bank-sec.internal.
_mcp._tcp.ms-security.bank-sec.internal. IN TXT "tools=ms-graph,defender-endpoint,sentinel" "auth=mTLS"
The agents can:
- Discover security tools via the MCP server:
// Request to list available tools ("category" is a hypothetical filter, not part of the MCP specification)
{
  "jsonrpc": "2.0",
  "method": "tools/list",
  "params": {"category": "response"},
  "id": 1
}
- Execute authorized actions through these tools:
// Request to isolate compromised endpoint
{
  "jsonrpc": "2.0",
  "method": "tools/call",
  "params": {
    "name": "endpoint-isolation",
    "arguments": {
      "hostname": "compromised-workstation",
      "reason": "Suspected malware infection",
      "duration": "until_manual_review"
    }
  },
  "id": 2
}
The ability to maintain security boundaries while enabling authenticated cross-boundary collaboration makes ACDP particularly valuable in cybersecurity contexts, where both protecting internal systems and leveraging external intelligence are critical.
For example, the Forensics Agent might discover that it can query Windows Event Logs but not perform active responses, while the Incident Response Agent has authorization to isolate endpoints when suspicious activity passes a certain threshold.
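That split in authority could be expressed as a small policy check before a `tools/call` request is ever built. Everything here is an assumption for illustration: the policy table, the `event-log-query` tool name, the 0.8 threshold and the `risk_score` input. The request uses `name` for the tool, which is the field the MCP specification defines for `tools/call`.

```python
# Hypothetical per-agent policy; the threshold applies to active response tools.
AGENT_POLICY = {
    "forensics": {"tools": {"event-log-query"}, "threshold": 0.0},
    "incident-response": {
        "tools": {"event-log-query", "endpoint-isolation"},
        "threshold": 0.8,
    },
}
ACTIVE_RESPONSE_TOOLS = {"endpoint-isolation"}


def build_tool_call(agent, tool, arguments, risk_score, request_id=1):
    """Return a JSON-RPC tools/call request, or raise if policy forbids it."""
    policy = AGENT_POLICY.get(agent)
    if policy is None or tool not in policy["tools"]:
        raise PermissionError(f"{agent} may not call {tool}")
    if tool in ACTIVE_RESPONSE_TOOLS and risk_score < policy["threshold"]:
        raise PermissionError("risk score below the active-response threshold")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
```

A client-side check like this is a convenience only; the MCP server would still need to enforce the same policy.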
The beauty of this approach is that new security tools can be added to the MCP server without modifying the agents themselves. If the bank adds a new security tool, it simply registers the tool, and all authorized agents can immediately discover and leverage its capabilities.
Conclusion
By leveraging familiar technologies while adding agent-specific discovery mechanisms, we're creating a foundation that's both robust and accessible. If this protocol were expanded and improved upon, I’d envision:
- Standard capability taxonomies for consistent agent classification
- Trust networks that help identify reliable collaborators
- Additional security layers for the various domains and use cases
- Specialized discovery mechanisms for different domains (cybersecurity, healthcare, finance, etc.)
- Enhanced collaboration/communication patterns beyond simple request-response (see Google's A2A)
- Shared History/Memory management capabilities
- Rate limiting and usage policies especially for open-access agents. Each agent’s metadata can include rate limits (e.g. “max 100 requests per hour from a single IP”) or licensing terms
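As a taste of the last item, a limit like "max 100 requests per hour from a single IP" could be enforced with a sliding window. This is one possible technique rather than anything the protocol prescribes, and `SlidingWindowLimiter` is a hypothetical name.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `max_requests` per client within a rolling time window."""

    def __init__(self, max_requests=100, window_seconds=3600.0):
        self.max_requests = max_requests
        self.window = window_seconds
        self._hits = defaultdict(deque)  # client ip -> timestamps of accepted requests

    def allow(self, client_ip, now):
        hits = self._hits[client_ip]
        # Forget requests that have aged out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```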
I don't know that this approach makes sense in closed ecosystems or for single platforms running agents, but as inter-agent communication and discovery become commonplace requirements, we'll benefit from having an easily implementable and scalable approach, whether it's this or something entirely different.
Regardless of whether this approach makes sense to you or not, if you're building an ecosystem of enterprise agents or contributing to an open network of public agents, it's worth thinking about this up front.
Take a look at the full write up of the protocol and sample code on GitHub. I’d be curious to hear your thoughts on it.