Synthetic P-wave velocity model generation using LLMs for seismic data simulation

Marcelo Guarido, Ivan Sanchez, Kristopher A. Innanen

In this work, we explored the use of large language models (LLMs) and multimodal models to generate synthetic P-wave velocity models for seismic data simulation. We tested GPT-5 and GPT-5-mini with extensive prompt engineering to generate velocity models from textual descriptions, but the outputs were often unrealistic and inconsistent. Few-shot prompts that included examples of complex velocity models did not yield satisfactory results either. Adding more examples to the few-shot prompt or using a better-designed system prompt might improve the results, but token limitations and resource constraints prevented us from exploring these options further. In contrast, DALL-E 2 showed promise in generating variations of velocity models from an initial image and text prompts. It can create multiple images at once, providing a range of velocity models that can be used for seismic data simulation. However, the generated models lacked a direct mapping to velocity values and exhibited varying degrees of geological plausibility.