IA4PYMES es una agencia especializada en automatización de procesos para PYMES mediante Inteligencia Artificial. Desarrollamos chatbots, automatizamos tareas repetitivas y creamos herramientas de IA personalizadas para cada negocio, con un ROI medio del +360%.

¿Cuánto cuesta automatizar mi negocio con IA?

El coste depende del proyecto específico. Ofrecemos una consulta gratuita de 30 minutos para analizar tus necesidades y darte un presupuesto personalizado sin compromiso. Antes de desarrollar nada, calculamos el ROI esperado: si los números no te benefician, no avanzamos.

¿Qué tipo de empresas pueden beneficiarse de vuestros servicios?

Cualquier PYME que quiera reducir tiempo en tareas repetitivas, mejorar la atención al cliente con chatbots, o automatizar procesos internos. Trabajamos con empresas de todos los sectores en España: comercio, logística, servicios profesionales, hostelería, inmobiliaria y más.

¿Cuánto tiempo tarda en implementarse una solución de IA?

Un chatbot básico puede estar listo en 2-3 semanas. Los proyectos de automatización de procesos suelen tardar entre 1 y 4 meses. Siempre trabajamos de forma colaborativa y con seguimiento continuo.

¿Necesito conocimientos técnicos para usar vuestras soluciones de IA?

No. Nuestras soluciones están diseñadas para que cualquier persona las use sin formación técnica. Nos encargamos de toda la implementación y formamos a tu equipo paso a paso.

¿Qué diferencia a IA4PYMES de otras agencias de IA?

Nos especializamos exclusivamente en PYMES españolas. No ofrecemos soluciones genéricas: cada proyecto se construye desde cero para tu negocio concreto. Además, solo iniciamos el desarrollo si el ROI calculado es favorable para ti.

¿Es seguro para mis datos trabajar con IA4PYMES?

Sí. Cumplimos con el RGPD, firmamos un acuerdo de confidencialidad y tus datos jamás se usan para entrenar modelos de IA públicos.

¿Puéis automatizar la atención al cliente de mi empresa?

Sí, es uno de nuestros casos de uso más frecuentes. Desarrollamos chatbots y agentes de IA que responden a clientes 24/7 por WhatsApp, web o email, reduciendo el tiempo de respuesta y liberando a tu equipo para tareas de mayor valor.

Sovereignty and Cost Control in Agentic Development: How to Run Codex Desktop with Local and Alternative Models using codex-shim

Agentic programming is no longer a laboratory experiment; it has become the productivity engine of modern software development teams. Tools like Codex Desktop — OpenAI's official application designed to run coding agents in parallel, manage code branches via worktree support, and automate testing — represent the state of the art in this field.

However, for tech SMEs and software consultancies, adopting these tools introduces three critical challenges:

Skyrocketing API Costs: Autonomous agents operate in a loop (planning, writing, compiling, testing, and debugging). This workflow consumes millions of tokens in a matter of hours. Running premium models like GPT-4o through the official API can inflate monthly bills to thousands of dollars.
Vendor Lock-in: Remaining tied exclusively to OpenAI's models and uptime limits your flexibility to exploit external innovations.
Data Privacy and GDPR Compliance Risks: Sending proprietary source code or confidential client repositories to external servers in the US can violate non-disclosure agreements (NDAs) and European data sovereignty regulations.

To address these challenges and reclaim control of your agentic workspace, the open-source community developed codex-shim (created by Sybil Solutions / 0xSero). In this guide, we analyze what this tool is, how to set it up step-by-step, and how it can slash your development costs by 95% while keeping your intellectual property safe.

What is codex-shim and How Does It Work?

codex-shim is a lightweight, local Python (aiohttp) proxy server that acts as a translation layer compatible with the OpenAI API.

Instead of Codex Desktop sending requests directly to OpenAI's cloud servers, you configure the application to point to your local codex-shim server (e.g., http://127.0.0.1:38440/v1).

When Codex Desktop makes an agentic query or executes a file modification, the traffic flows as follows:

Interception: The shim intercepts the incoming HTTP requests from the Codex Desktop client.
Translation & Mapping: The shim translates the prompt structure, system messages, and tool call schemas (which Codex uses to interact with the shell or run web searches) into the exact formats expected by your chosen upstream provider (such as Anthropic, DeepSeek, OpenRouter, or local backends).
Upstream Request: The translated request is forwarded to the configured AI backend.
Response Translation: The shim receives the response (including streaming tokens and tool calls) and translates it back into the exact schema Codex Desktop expects, preventing tool execution from failing.

This translation happens locally in milliseconds, allowing Codex Desktop to work seamlessly with alternative models without requiring any modification of the compiled OpenAI application.

Step-by-Step Installation & Setup

Here is the technical guide to deploy codex-shim on your local development environments.

1. Clone the Repository and Install Dependencies

Ensure you have Python 3.11+ installed on your machine.

For macOS / Linux / WSL / Git Bash:

git clone https://github.com/0xSero/codex-shim ~/codex-shim
cd ~/codex-shim
python3 -m pip install --user -e .

For Native Windows (PowerShell):

git clone https://github.com/0xSero/codex-shim $HOME\codex-shim
cd $HOME\codex-shim
py -3.11 -m pip install --user -e .

This installs codex-shim as an executable CLI utility in your local user path.

2. Configure Your Upstream Models

The routing logic and API keys are defined in a JSON file called models.json.

The CLI tool looks for this file at:

macOS / Linux / WSL: ~/.codex-shim/models.json
Native Windows: C:\Users\<YourUsername>\.codex-shim\models.json

Create the directory and the file. In this example, we configure a cheap cloud API (DeepSeek), a local open-source backend (Ollama), and a premium model (Anthropic):

{
  "models": [
    {
      "slug": "deepseek-coder",
      "provider": "openai",
      "base_url": "https://api.deepseek.com/v1",
      "api_key": "sk-your-deepseek-api-key"
    },
    {
      "slug": "local-llama3",
      "provider": "openai",
      "base_url": "http://127.0.0.1:11434/v1",
      "api_key": "ollama"
    },
    {
      "slug": "claude-sonnet",
      "provider": "anthropic",
      "base_url": "https://api.anthropic.com/v1",
      "api_key": "sk-ant-your-anthropic-key"
    }
  ],
  "router": {
    "enabled": true,
    "fallback_model": "deepseek-coder"
  }
}

3. Connect Codex Desktop to the Shim

To redirect Codex Desktop requests to the local proxy, we need to update the application's global settings in ~/.codex/config.toml.

The CLI makes this simple. In your terminal, run:

# Generate the Codex-compatible model catalog from your models.json
codex-shim generate

# Set your active default model
codex-shim model use deepseek-coder

This automatically updates your config.toml file, routing the Codex base URL to http://127.0.0.1:38440/v1 and configuring the matching model identifiers.

4. Launch the Local Proxy Daemon

Start the background proxy server:

codex-shim start

Verify that the proxy is active and listing your models:

codex-shim list

Now, when you open Codex Desktop, the agentic harness will execute code modifications, terminal tasks, and searches using your chosen backend dynamically and transparently.

Competitive Advantages for Tech SMEs

Integrating an independent agentic development layer using codex-shim provides massive strategic value for B2B software engineering teams:

95% API Cost Reduction

OpenAI's GPT-4o costs roughly $5.00 per million input tokens and $15.00 per million output tokens. For autonomous agents that continually read, write, and debug code, token usage adds up fast. By routing Codex Desktop to DeepSeek-Coder-V2 via the shim, the cost drops to $0.14 per million input tokens and $0.28 per million output tokens. This represents a cost reduction of over 95%, making it economically feasible to equip your entire development team with active agentic coding workspaces.

Absolute Data Sovereignty (GDPR & NDA Compliance)

By routing the proxy to a local Ollama server or a private vLLM cluster running open-source models (such as Llama 3 70B or a local DeepSeek instance), no source code leaves your corporate network. This guarantees absolute compliance with European GDPR regulations and satisfies the strict intellectual property requirements of corporate clients.

Model Flexibility & Best-of-Breed Tooling

Development teams are no longer locked into a single provider. You can leverage Claude 3.5 Sonnet (widely considered the industry benchmark for complex refactoring and logical coding tasks) and switch instantly to low-cost or local models for simple unit testing or boilerplate generation.

Smart Auto-Router for Cost Control

The shim's built-in router (codex-auto) uses a local classifier model to evaluate prompt complexity:

Simple requests (e.g., "add code comments to this file") are routed to your free local LLM or the cheapest provider.
Complex tasks requiring multi-file context and logical reasoning are automatically escalated to Claude 3.5 Sonnet or GPT-4o. This ensures optimal resource allocation without requiring developers to change settings manually.

Conclusion

The Codex Desktop application is one of the most powerful agentic development tools available, but relying solely on OpenAI's APIs limits its cost-effectiveness and compliance in enterprise environments. Deploying a smart proxy like codex-shim allows SMEs to merge the best of both worlds: a world-class agentic user interface and the cost savings, sovereignty, and choice of open-source and alternative AI models.

🛠️ Ready to deploy a secure, private agentic development workspace in your company?

At IA4PYMES, we help software companies and IT departments set up private LLM servers, configure developer proxies like codex-shim, and define code governance guidelines that ensure GDPR compliance and maximize developer output.

Book a free 15-minute technical consultation with our engineering team today and let's build your custom private AI development stack.