<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Philip Conrod, Author at Philip Conrod</title>
	<atom:link href="https://www.philipconrod.com/author/philip/feed/" rel="self" type="application/rss+xml" />
	<link>https://www.philipconrod.com/author/philip/</link>
	<description>We Provide Information Technology &#38; Cybersecurity Curriculum, Training &#38; Consulting Services for Businesses, School Districts and Non-Profit Organizations</description>
	<lastBuildDate>Thu, 02 Apr 2026 01:56:48 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.9.4</generator>

<image>
	<url>https://www.philipconrod.com/wp-content/uploads/2025/02/cropped-Philip-Conrod-32x32.jpg</url>
	<title>Philip Conrod, Author at Philip Conrod</title>
	<link>https://www.philipconrod.com/author/philip/</link>
	<width>32</width>
	<height>32</height>
</image> 
	<item>
		<title>Co-Developing Unreal Engine Video Games using Local Ollama LLM Models with the Latest v1.0.66 Beta Version of the Betide NeoStack AI Unreal Plugin</title>
		<link>https://www.philipconrod.com/co-developing-unreal-engine-video-games-using-local-ollama-llm-models-with-the-latest-v1-0-66-beta-version-of-the-betide-neostack-ai-unreal-plugin/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=co-developing-unreal-engine-video-games-using-local-ollama-llm-models-with-the-latest-v1-0-66-beta-version-of-the-betide-neostack-ai-unreal-plugin</link>
		
		<dc:creator><![CDATA[Philip Conrod]]></dc:creator>
		<pubDate>Wed, 01 Apr 2026 18:00:27 +0000</pubDate>
				<category><![CDATA[Event]]></category>
		<category><![CDATA[Frontpage Article]]></category>
		<category><![CDATA[News]]></category>
		<guid isPermaLink="false">https://www.philipconrod.com/?p=12782</guid>

					<description><![CDATA[Back in February I started testing the NeoStack AI Plugin by BETIDE STUDIO for AI assisted Unreal Engine game development.   The NeoStack AI plugin is still in Beta, and I am currently testing the latest beta version (v1.0.66). In addition to using OpenRouter to run free LLM models, NeoStack AI recently added the capability to [&#8230;]]]></description>
										<content:encoded><![CDATA[<p><img decoding="async" src="https://www.philipconrod.com/wp-content/uploads/2026/02/NeoStack-AI-Logo-1030x579.png" /></p>
<p>Back in February I started testing the NeoStack AI Plugin by BETIDE STUDIO for AI-assisted Unreal Engine game development. The plugin is still in beta, and I am currently testing the latest beta version (v1.0.66).</p>
<p><img fetchpriority="high" decoding="async" class="alignnone size-full wp-image-12783" src="https://www.philipconrod.com/wp-content/uploads/2026/04/NetoStack.1.0.66-Plugin.png" alt="" width="1218" height="100" srcset="https://www.philipconrod.com/wp-content/uploads/2026/04/NetoStack.1.0.66-Plugin.png 1218w, https://www.philipconrod.com/wp-content/uploads/2026/04/NetoStack.1.0.66-Plugin-300x25.png 300w, https://www.philipconrod.com/wp-content/uploads/2026/04/NetoStack.1.0.66-Plugin-1030x85.png 1030w, https://www.philipconrod.com/wp-content/uploads/2026/04/NetoStack.1.0.66-Plugin-768x63.png 768w, https://www.philipconrod.com/wp-content/uploads/2026/04/NetoStack.1.0.66-Plugin-705x58.png 705w, https://www.philipconrod.com/wp-content/uploads/2026/04/NetoStack.1.0.66-Plugin-1210x100.png 1210w" sizes="(max-width: 1218px) 100vw, 1218px" /></p>
<p>In addition to using OpenRouter to run free LLM models, NeoStack AI recently added the capability to run free local LLM models via Ollama on my GeForce RTX-3080.</p>
<p><img decoding="async" class="alignnone size-full wp-image-12784" src="https://www.philipconrod.com/wp-content/uploads/2026/04/Betide-NeoStack-AI-Ollama-Local-LLM-Model-Selection-Options.png" alt="" width="1006" height="996" srcset="https://www.philipconrod.com/wp-content/uploads/2026/04/Betide-NeoStack-AI-Ollama-Local-LLM-Model-Selection-Options.png 1006w, https://www.philipconrod.com/wp-content/uploads/2026/04/Betide-NeoStack-AI-Ollama-Local-LLM-Model-Selection-Options-300x297.png 300w, https://www.philipconrod.com/wp-content/uploads/2026/04/Betide-NeoStack-AI-Ollama-Local-LLM-Model-Selection-Options-80x80.png 80w, https://www.philipconrod.com/wp-content/uploads/2026/04/Betide-NeoStack-AI-Ollama-Local-LLM-Model-Selection-Options-768x760.png 768w, https://www.philipconrod.com/wp-content/uploads/2026/04/Betide-NeoStack-AI-Ollama-Local-LLM-Model-Selection-Options-36x36.png 36w, https://www.philipconrod.com/wp-content/uploads/2026/04/Betide-NeoStack-AI-Ollama-Local-LLM-Model-Selection-Options-705x698.png 705w" sizes="(max-width: 1006px) 100vw, 1006px" /></p>
<p>I am really excited about their new Ollama local LLM capability, which lets me take advantage of the free processing power of my offline NVIDIA GeForce RTX-3080 graphics card.</p>
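<p>For readers who want to try the same local setup outside the plugin, here is a minimal sketch of how a client talks to Ollama&#8217;s REST API. It assumes a stock Ollama install listening on its default port 11434; the <code>llama3</code> model name is just a placeholder for whatever you have pulled locally.</p>

```python
import json
import urllib.request

# Ollama's default local generate endpoint.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming generate request for a local Ollama server."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    return urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )

# "llama3" is a placeholder; substitute any model pulled with `ollama pull`.
req = build_request("llama3", "Name three Unreal Engine Blueprint node types.")

# To actually run it (requires Ollama running locally):
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["response"])
```

<p>With <code>"stream": False</code> the server returns one JSON object whose <code>response</code> field holds the full completion; omit it to receive newline-delimited streaming chunks instead.</p>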
<p>As a reminder, the key features of the NeoStack AI are:</p>
<ul>
<li>Multi-Agent Support &#8211; Connect to Claude Code, Gemini CLI, or OpenRouter</li>
<li>Native Editor UI &#8211; Slate-based chat window with streaming responses</li>
<li>Asset Generation &#8211; Create Blueprints, Materials, Behavior Trees, Data Tables and more</li>
<li>Context Attachments &#8211; Attach Blueprint nodes or assets to your prompts</li>
<li>Project Indexing &#8211; Automatic project indexing for context-aware suggestions</li>
</ul>
<p>I&#8217;ll continue to keep you updated on both my cloud and local LLM testing with NeoStack AI for Unreal Engine.</p>
<p>&nbsp;</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>OpenAI Releases GPT-5.4 and it is State of the Art (SOTA) across Key Coding Benchmarks</title>
		<link>https://www.philipconrod.com/openai-releases-gpt-5-4-and-it-is-state-of-the-art-sota-across-key-coding-benchmarks/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=openai-releases-gpt-5-4-and-it-is-state-of-the-art-sota-across-key-coding-benchmarks</link>
		
		<dc:creator><![CDATA[Philip Conrod]]></dc:creator>
		<pubDate>Wed, 11 Mar 2026 07:24:57 +0000</pubDate>
				<category><![CDATA[Event]]></category>
		<category><![CDATA[Frontpage Article]]></category>
		<category><![CDATA[News]]></category>
		<guid isPermaLink="false">https://www.philipconrod.com/?p=12748</guid>

					<description><![CDATA[On March 5th, 2026, OpenAI released GPT-5.4. GPT-5.4 brings together the best of their &#8220;recent advances in reasoning, coding, and agentic workflows into a single frontier model&#8221;. OpenAI said that GPT-5.4 &#8220;incorporates the industry-leading coding capabilities of GPT‑5.3‑Codex⁠ while improving how the model works across tools, software environments, and professional tasks involving spreadsheets, presentations, and [&#8230;]]]></description>
										<content:encoded><![CDATA[<p>On March 5th, 2026, OpenAI released GPT-5.4. GPT-5.4 brings together the best of their &#8220;recent advances in reasoning, coding, and agentic workflows into a single frontier model&#8221;. OpenAI said that GPT-5.4 &#8220;incorporates the industry-leading coding capabilities of GPT‑5.3‑Codex⁠ while improving how the model works across tools, software environments, and professional tasks involving spreadsheets, presentations, and documents.&#8221; GPT-5.3 Codex is great for agentic coding, and combining it with GPT-5.4 helps you get real work done accurately, effectively, and efficiently.</p>
<p>OpenAI shared the following benchmarks comparing 5.4 to 5.3 Codex and GPT 5.2 in their announcement:</p>
<p><a href="https://openai.com/index/introducing-gpt-5-4/" target="_blank" rel="noopener"><img decoding="async" class="aligncenter wp-image-12749 size-full" src="https://www.philipconrod.com/wp-content/uploads/2026/03/OpenAI-GPT-5.4-Benchmarks.png" alt="" width="631" height="305" srcset="https://www.philipconrod.com/wp-content/uploads/2026/03/OpenAI-GPT-5.4-Benchmarks.png 631w, https://www.philipconrod.com/wp-content/uploads/2026/03/OpenAI-GPT-5.4-Benchmarks-300x145.png 300w" sizes="(max-width: 631px) 100vw, 631px" /></a></p>
<p>&nbsp;</p>
<p>For Coding, they also shared their SWE-Bench Pro (public) Accuracy vs Latency Benchmarks:</p>
<p>&nbsp;</p>
<p><img loading="lazy" decoding="async" class="alignnone wp-image-12750 aligncenter" src="https://www.philipconrod.com/wp-content/uploads/2026/03/OpenAI-GPT-5.4-SWE-Bench-Pro-Benchmarks.png" alt="" width="631" height="435" srcset="https://www.philipconrod.com/wp-content/uploads/2026/03/OpenAI-GPT-5.4-SWE-Bench-Pro-Benchmarks.png 736w, https://www.philipconrod.com/wp-content/uploads/2026/03/OpenAI-GPT-5.4-SWE-Bench-Pro-Benchmarks-300x207.png 300w, https://www.philipconrod.com/wp-content/uploads/2026/03/OpenAI-GPT-5.4-SWE-Bench-Pro-Benchmarks-705x487.png 705w" sizes="auto, (max-width: 631px) 100vw, 631px" /></p>
<p>As a demonstration of their model&#8217;s improved computer-use and coding capabilities working in tandem, they released an experimental Codex skill called &#8220;Playwright (Interactive)&#8221;. This new skill &#8220;allows Codex to visually debug web and Electron apps; it can even be used to test an app it&#8217;s building, as it&#8217;s building it.&#8221;</p>
<p>&nbsp;</p>
<p>I can&#8217;t wait to test and see how it has improved AI assisted game development!  I&#8217;ll let you know how my testing goes via my AI Assisted Unreal Engine video game development project using my Betide NeoStack AI Plugin and my OpenRouter account.</p>
<p><img loading="lazy" decoding="async" class=" wp-image-12756 aligncenter" src="https://www.philipconrod.com/wp-content/uploads/2026/03/OpenAI-GPT-5.4-Via-OpenRouter.png" alt="" width="644" height="388" srcset="https://www.philipconrod.com/wp-content/uploads/2026/03/OpenAI-GPT-5.4-Via-OpenRouter.png 1006w, https://www.philipconrod.com/wp-content/uploads/2026/03/OpenAI-GPT-5.4-Via-OpenRouter-300x181.png 300w, https://www.philipconrod.com/wp-content/uploads/2026/03/OpenAI-GPT-5.4-Via-OpenRouter-768x463.png 768w, https://www.philipconrod.com/wp-content/uploads/2026/03/OpenAI-GPT-5.4-Via-OpenRouter-705x425.png 705w" sizes="auto, (max-width: 644px) 100vw, 644px" /></p>
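<p>Under the hood, OpenRouter exposes an OpenAI-compatible chat completions endpoint, so pointing a tool at a new model is mostly a matter of setting the base URL, an API key, and a model ID. Here is a minimal sketch; the model ID <code>openai/gpt-5.4</code> and the key shown are illustrative placeholders, not verified values.</p>

```python
import json

# OpenRouter's OpenAI-compatible chat completions endpoint.
OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"

def build_chat_request(model: str, prompt: str, api_key: str):
    """Return (headers, body) for a single-turn chat completion call."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    })
    return headers, body

# Placeholder model ID and key; substitute your own from the OpenRouter dashboard.
headers, body = build_chat_request(
    "openai/gpt-5.4",
    "Outline a Blueprint for a sliding door in Unreal Engine.",
    "sk-or-...",
)
```

<p>Because the request shape matches OpenAI&#8217;s chat API, the same payload works with any OpenAI-style client library by swapping in the OpenRouter base URL.</p>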
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Epic Games recently acquired Meshcapade, a startup specializing in AI technologies for creating and animating hyper-realistic digital human models and animations from video recordings </title>
		<link>https://www.philipconrod.com/epic-games-recently-acquired-meshcapade-a-startup-specializing-in-ai-technologies-for-creating-and-animating-hyper-realistic-digital-human-models-and-animations-from-video-recordings/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=epic-games-recently-acquired-meshcapade-a-startup-specializing-in-ai-technologies-for-creating-and-animating-hyper-realistic-digital-human-models-and-animations-from-video-recordings</link>
		
		<dc:creator><![CDATA[Philip Conrod]]></dc:creator>
		<pubDate>Sun, 22 Feb 2026 08:01:57 +0000</pubDate>
				<category><![CDATA[Event]]></category>
		<category><![CDATA[Frontpage Article]]></category>
		<category><![CDATA[News]]></category>
		<guid isPermaLink="false">https://www.philipconrod.com/?p=12732</guid>

					<description><![CDATA[Epic Games recently acquired Meshcapade, a startup specializing in AI technologies for creating and animating hyper-realistic digital humans. Meshcapade, a spin-off from the Max Planck Institute for Intelligent Systems based in Tübingen, Germany, develops AI tools that generate precise 3D body models and animations from video recordings. Meshcapade&#8217;s AI tools, built on the SMPL (Skinned [&#8230;]]]></description>
										<content:encoded><![CDATA[<p>Epic Games recently acquired <a href="https://www.mpg.de/26082348/max-planck-spin-off-meshcapade-draws-epic-games-to-tuebingen" target="_blank" rel="nofollow noopener">Meshcapade</a>, a startup specializing in AI technologies for creating and animating hyper-realistic digital humans. <a href="https://meshcapade.com/" target="_blank" rel="nofollow noopener">Meshcapade</a>, a spin-off from the Max Planck Institute for Intelligent Systems based in Tübingen, Germany, develops AI tools that generate precise 3D body models and animations from video recordings.</p>
<p><img loading="lazy" decoding="async" class="alignnone size-large wp-image-12733" src="https://www.philipconrod.com/wp-content/uploads/2026/02/Meshcapade-Example-1-1030x476.png" alt="" width="1030" height="476" srcset="https://www.philipconrod.com/wp-content/uploads/2026/02/Meshcapade-Example-1-1030x476.png 1030w, https://www.philipconrod.com/wp-content/uploads/2026/02/Meshcapade-Example-1-300x139.png 300w, https://www.philipconrod.com/wp-content/uploads/2026/02/Meshcapade-Example-1-768x355.png 768w, https://www.philipconrod.com/wp-content/uploads/2026/02/Meshcapade-Example-1-1536x709.png 1536w, https://www.philipconrod.com/wp-content/uploads/2026/02/Meshcapade-Example-1-1500x693.png 1500w, https://www.philipconrod.com/wp-content/uploads/2026/02/Meshcapade-Example-1-705x326.png 705w, https://www.philipconrod.com/wp-content/uploads/2026/02/Meshcapade-Example-1.png 1975w" sizes="auto, (max-width: 1030px) 100vw, 1030px" /></p>
<p dir="auto">Meshcapade&#8217;s AI tools, built on the SMPL (Skinned Multi-Person Linear) parametric body model, primarily target and automate the body modeling and full-body motion capture stages in Epic&#8217;s MetaHuman pipeline, which currently relies on limited presets for bodies and less advanced video-based tracking for animation.</p>
<p>What Meshcapade Replaces in the current MetaHuman Pipeline:</p>
<ol>
<li>Preset/Manual Body Modeling: Replaces MetaHuman Creator&#8217;s ~50 preset bodies and external sculpting/rigging with AI-generated custom SMPL bodies from a single image, video clip, or scan. It extracts precise shape, pose, and clothing details automatically—no manual adjustments needed.</li>
<li>Expensive Mocap Hardware: Replaces marker/suit-based systems (e.g., optical mocap studios) with markerless full-body capture from any camera (phone/webcam/pro). It tracks subtle motions like fingers/hands, camera movement, and multi-person scenes.</li>
<li>Fragmented Body Animation: Complements/enhances MetaHuman Animator&#8217;s body solve with superior AI-driven full-body mocap, reducing retargeting hassles (pre-acquisition the plugin existed; now it will become native).</li>
</ol>
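<p>To make the SMPL idea concrete: the model represents a body as a fixed template mesh deformed by a small set of learned parameters (roughly 10 shape coefficients and 72 pose values, 3 axis-angle numbers for each of 24 joints, in the published model). The toy sketch below shows only the linear shape blend-shape step, on random stand-in data rather than the real SMPL meshes, and omits the pose-dependent corrective terms and skinning:</p>

```python
import numpy as np

# Toy parametric body model in the spirit of SMPL (illustrative only):
# a template mesh plus a linear combination of learned shape blend-shapes.
# Real SMPL uses 6890 vertices and learned blend-shape matrices; here both
# are random stand-ins.
rng = np.random.default_rng(0)
n_vertices, n_betas = 100, 10
template = rng.standard_normal((n_vertices, 3))          # rest-pose mesh
shape_dirs = rng.standard_normal((n_vertices, 3, n_betas))  # shape blend-shapes

def shaped_vertices(betas: np.ndarray) -> np.ndarray:
    """Apply shape blend-shapes: V = T + S @ beta."""
    return template + shape_dirs @ betas

v = shaped_vertices(np.zeros(n_betas))
assert np.allclose(v, template)  # zero betas recover the template body
```

<p>Fitting such a model to a photo or video, as Meshcapade&#8217;s tools do, amounts to searching for the shape and pose parameters whose rendered body best matches the observed person.</p>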
<p>With the acquisition, Epic says it&#8217;s &#8220;looking forward to working together to advance digital human technologies for use across gaming, film and entertainment.&#8221; I&#8217;m personally looking forward to a much simpler MetaHuman creation and animation pipeline for video game development and virtual production.</p>
<p>&nbsp;</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Claude Sonnet 4.6 Released on February 17, 2026</title>
		<link>https://www.philipconrod.com/claude-sonnet-4-6-released-on-february-17-2026/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=claude-sonnet-4-6-released-on-february-17-2026</link>
		
		<dc:creator><![CDATA[Philip Conrod]]></dc:creator>
		<pubDate>Sat, 21 Feb 2026 23:05:03 +0000</pubDate>
				<category><![CDATA[Event]]></category>
		<category><![CDATA[Frontpage Article]]></category>
		<category><![CDATA[News]]></category>
		<guid isPermaLink="false">https://www.philipconrod.com/?p=12721</guid>

					<description><![CDATA[Anthropic released Claude Sonnet 4.6 on February 17, 2026 &#8211; almost 2 weeks after releasing their flagship Claude Opus 4.6 model.   Below are the latest benchmarks that Anthropic published in their Claude Sonnet 4.6 announcement: &#160; You can read more about Anthropic Claude Sonnet 4.6 here.]]></description>
										<content:encoded><![CDATA[<p>Anthropic released Claude Sonnet 4.6 on February 17, 2026 &#8211; almost 2 weeks after releasing their flagship Claude Opus 4.6 model.   Below are the latest benchmarks that Anthropic published in their Claude Sonnet 4.6 announcement:</p>
<p>&nbsp;</p>
<p><a href="https://www.anthropic.com/news/claude-sonnet-4-6?hsLang=en" target="_blank" rel="nofollow noopener"><img loading="lazy" decoding="async" class="alignnone wp-image-12722" src="https://www.philipconrod.com/wp-content/uploads/2026/02/Antropic-Benchmarks-Sonnet-4.6.png" alt="" width="587" height="670" srcset="https://www.philipconrod.com/wp-content/uploads/2026/02/Antropic-Benchmarks-Sonnet-4.6.png 862w, https://www.philipconrod.com/wp-content/uploads/2026/02/Antropic-Benchmarks-Sonnet-4.6-263x300.png 263w, https://www.philipconrod.com/wp-content/uploads/2026/02/Antropic-Benchmarks-Sonnet-4.6-768x876.png 768w, https://www.philipconrod.com/wp-content/uploads/2026/02/Antropic-Benchmarks-Sonnet-4.6-618x705.png 618w" sizes="auto, (max-width: 587px) 100vw, 587px" /></a></p>
<p>You can read more about Anthropic Claude Sonnet 4.6 <a href="https://www.anthropic.com/news/claude-sonnet-4-6?hsLang=en" target="_blank" rel="nofollow noopener">here</a>.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>AI Co-Development Just Got a Lot Easier with the Release of Claude Opus 4.6 and Open AI GPT 5.3 Codex on February 5, 2026</title>
		<link>https://www.philipconrod.com/ai-co-development-just-got-alot-easier-with-the-release-of-claude-opus-4-6-and-open-ai-gpt-5-3-codex-on-february-5-2026/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=ai-co-development-just-got-alot-easier-with-the-release-of-claude-opus-4-6-and-open-ai-gpt-5-3-codex-on-february-5-2026</link>
		
		<dc:creator><![CDATA[Philip Conrod]]></dc:creator>
		<pubDate>Fri, 13 Feb 2026 19:14:30 +0000</pubDate>
				<category><![CDATA[Frontpage Article]]></category>
		<category><![CDATA[News]]></category>
		<guid isPermaLink="false">https://www.philipconrod.com/?p=12692</guid>

					<description><![CDATA[In this article, I will compare two specific domains: coding an agentic application (e.g., autonomous AI agents that use tools, plan multi-step workflows, interact with environments like terminals/browsers, debug in loops, and handle real-world software engineering) and game development (e.g., designing game logic, implementing mechanics in engines like Pygame/Unity-style code, handling physics/AI/pathfinding, procedural generation, balancing [&#8230;]]]></description>
										<content:encoded><![CDATA[<p>In this article, I will compare two specific domains: coding an agentic application (e.g., autonomous AI agents that use tools, plan multi-step workflows, interact with environments like terminals/browsers, debug in loops, and handle real-world software engineering) and game development (e.g., designing game logic, implementing mechanics in engines like Pygame/Unity-style code, handling physics/AI/pathfinding, procedural generation, balancing systems, and iterating on prototypes).</p>
<p>On February 5, 2026 <strong>Anthropic</strong> released <a href="https://www.anthropic.com/news/claude-opus-4-6" target="_blank" rel="nofollow noopener"><strong>Claude Opus 4.6</strong></a> which is their current flagship, optimized for the most demanding, long-horizon, and complex tasks.  On the very same day, <strong>OpenAI released <a href="https://openai.com/index/introducing-gpt-5-3-codex/" target="_blank" rel="nofollow noopener">GPT-5.3 Codex</a></strong> (the dedicated agentic coding line). GPT-5.3 Codex is OpenAI&#8217;s current flagship for advanced coding and agentic workflows.</p>
<p><strong>Key Specs Comparison: </strong></p>
<p><strong>Context Window</strong></p>
<ul>
<li>Claude Sonnet 4.5: 200K tokens</li>
<li>Claude Opus 4.6: 200K standard / <strong>1M tokens (beta, with much higher reliability)</strong></li>
<li>GPT-5.3-Codex: Not explicitly stated in releases (likely 200K–512K range, consistent with GPT-5 family; strong on long-horizon without 1M claims yet)</li>
</ul>
<p><strong>Pricing (API, per million tokens)</strong></p>
<ul>
<li>Sonnet 4.5: $3 input / $15 output</li>
<li>Opus 4.6: $5 input / $25 output (higher beyond 200K)</li>
<li>GPT-5.3-Codex: Similar to prior GPT-5.x tiers (~$5–15 input / $15–75 output range; exact Codex rates match high-tier GPT access via app/CLI/API)</li>
</ul>
<p><strong>Speed</strong></p>
<ul>
<li>Sonnet 4.5: Fastest of the three for iteration</li>
<li>Opus 4.6: Slower (deeper thinking/effort modes)</li>
<li>GPT-5.3-Codex: ~25% faster than GPT-5.2-Codex; feels responsive for agent steering</li>
</ul>
<p><strong>Reasoning/Agent Style</strong>: All three are hybrid/agentic-tuned. Opus 4.6 has adaptive effort + best long-horizon sustain. GPT-5.3-Codex emphasizes steerable, interactive agents (you guide mid-task without context loss). Sonnet 4.5 balances speed + quality.</p>
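<p>Since the Claude models share the same token accounting, the published per-million-token rates make cost comparisons easy to script. A quick sketch using the standard-tier prices quoted above (ignoring caching discounts and the higher above-200K Opus rates):</p>

```python
# USD per million tokens, standard API rates as quoted in this post.
# Verify current pricing before relying on these numbers.
PRICES = {
    "claude-sonnet-4.5": (3.00, 15.00),
    "claude-opus-4.6": (5.00, 25.00),
}

def job_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one job from its token counts."""
    price_in, price_out = PRICES[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000

# Example: a long agentic run with 2M input tokens and 150K output tokens.
for model in PRICES:
    print(f"{model}: ${job_cost(model, 2_000_000, 150_000):.2f}")
```

<p>For that example run, Sonnet 4.5 comes to $8.25 versus $13.75 for Opus 4.6, which is why Sonnet remains the default for high-volume iteration.</p>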
<p><strong>Performance on Agentic Coding / Building Agentic Applications:</strong></p>
<p>This category is extremely competitive right now — the February 5 releases were direct head-to-heads.</p>
<ul>
<li><strong>Terminal-Bench 2.0</strong> (terminal/tool-use agentic coding): GPT-5.3-Codex leads at ~77.3% (state-of-the-art claim); Opus 4.6 close behind (~65–72% range in prior reports, but Anthropic claims top on agent evals); Sonnet 4.5 lower (~51%).</li>
<li><strong>SWE-Bench Pro/Verified</strong> (real GitHub issues, multi-lang): GPT-5.3-Codex hits new highs on Pro variant (multi-lang, harder); Opus 4.6 edges on Verified for complex fixes (~80–81%); Sonnet 4.5 strong but trails slightly.</li>
<li><strong>OSWorld / Computer Use</strong>: GPT-5.3-Codex strong (~64–65%); Opus 4.6 leads in sustained GUI/terminal chains.</li>
<li><strong>Long-horizon Agents</strong>: Opus 4.6 excels at multi-hour autonomous runs, massive context, self-correction. GPT-5.3-Codex shines on interactive steering (like a live colleague) + tool/research chaining. Sonnet 4.5 great for volume/prototyping but needs more retries on extreme complexity.</li>
<li><strong>Real-world vibe</strong>: Developers report GPT-5.3-Codex feels &#8220;more Claude-like&#8221; than prior OpenAI models (better git, debugging, broad tasks). Opus 4.6 often wins on deep codebase navigation + fewer hallucinations. Many use both in parallel.</li>
</ul>
<p>Below is the Terminal-Bench 2.0 score of 65.4% that Anthropic published for Claude Opus 4.6:</p>
<p><img loading="lazy" decoding="async" class="wp-image-12696 alignnone" src="https://www.philipconrod.com/wp-content/uploads/2026/02/Anthropic-Claude-Opus-46-Terminal-Bench-2-1030x402.png" alt="" width="771" height="301" srcset="https://www.philipconrod.com/wp-content/uploads/2026/02/Anthropic-Claude-Opus-46-Terminal-Bench-2-1030x402.png 1030w, https://www.philipconrod.com/wp-content/uploads/2026/02/Anthropic-Claude-Opus-46-Terminal-Bench-2-300x117.png 300w, https://www.philipconrod.com/wp-content/uploads/2026/02/Anthropic-Claude-Opus-46-Terminal-Bench-2-768x300.png 768w, https://www.philipconrod.com/wp-content/uploads/2026/02/Anthropic-Claude-Opus-46-Terminal-Bench-2-1536x600.png 1536w, https://www.philipconrod.com/wp-content/uploads/2026/02/Anthropic-Claude-Opus-46-Terminal-Bench-2-1500x586.png 1500w, https://www.philipconrod.com/wp-content/uploads/2026/02/Anthropic-Claude-Opus-46-Terminal-Bench-2-705x275.png 705w, https://www.philipconrod.com/wp-content/uploads/2026/02/Anthropic-Claude-Opus-46-Terminal-Bench-2.png 1592w" sizes="auto, (max-width: 771px) 100vw, 771px" /></p>
<p>Below is the Terminal-Bench 2.0 score of 77.3% that OpenAI published for GPT-5.3 Codex:</p>
<p><img loading="lazy" decoding="async" class="wp-image-12697 alignnone" src="https://www.philipconrod.com/wp-content/uploads/2026/02/OpenAI-53-CodexTerminal-Bench-2.png" alt="" width="341" height="364" srcset="https://www.philipconrod.com/wp-content/uploads/2026/02/OpenAI-53-CodexTerminal-Bench-2.png 766w, https://www.philipconrod.com/wp-content/uploads/2026/02/OpenAI-53-CodexTerminal-Bench-2-281x300.png 281w, https://www.philipconrod.com/wp-content/uploads/2026/02/OpenAI-53-CodexTerminal-Bench-2-660x705.png 660w" sizes="auto, (max-width: 341px) 100vw, 341px" /></p>
<p><strong>For serious agentic apps</strong> (autonomous tools, computer-use agents, production reliability): It&#8217;s a toss-up between <strong>Opus 4.6</strong> (best long sustain + 1M context) and <strong>GPT-5.3-Codex</strong> (interactive steering + benchmark edges). Sonnet 4.5 is excellent but not quite frontier here.</p>
<p><strong>Performance on Game Development</strong></p>
<p>Game development favors creativity + systems integration + rapid iteration (mechanics, AI, procedural gen, balancing, physics sims).  All three handle Pygame prototypes, Godot/Unity pseudocode, NPC AI, etc., very well.</p>
<ul>
<li><strong>Sonnet 4.5</strong>: Best for fast prototypes/iterations — clean loops, quick balancing tweaks, simple pathfinding/behavior trees. Preferred for indie-speed work.</li>
<li><strong>Opus 4.6</strong>: Pulls ahead on complex, interconnected systems (e.g., economy + AI opponents + physics + UI + narrative). Better at coherent large-scale design → code chains, debugging edge cases in bigger codebases, creative-yet-coherent ideas.</li>
<li><strong>GPT-5.3-Codex</strong>: Strong contender — OpenAI highlights building &#8220;highly functional complex games and apps from scratch over days.&#8221; Excels at multi-step execution + research/tool use (e.g., pulling refs for mechanics). Feels more &#8220;productive&#8221; for iterative game workflows.</li>
</ul>
<p><strong>For most game prototypes/indie development</strong>: <strong>Sonnet 4.5</strong> (speed + cost) or <strong>GPT-5.3-Codex</strong> (if you want interactive guidance).</p>
<p><strong>For ambitious/systems-heavy games</strong> (e.g., simulation layers, procedural worlds, long development cycles): <strong>Opus 4.6</strong> edges out slightly on cohesion + autonomy, but <strong>GPT-5.3-Codex</strong> is very close and often faster to iterate.</p>
<p><strong>My Current Recommendation for Cloud AI Agent Assisted Development:</strong></p>
<ul>
<li><strong>Claude Sonnet 4.5</strong> — Default choice for 80% of agentic coding + game development: fast, cheap, reliable near-frontier.</li>
<li><strong>Claude Opus 4.6</strong> — Pick for the hardest agentic/long-horizon work, massive context, or when you need max reliability/cohesion (especially complex games or production agents).</li>
<li><strong>GPT-5.3-Codex</strong> — Pick (or combine) if you value interactive steering, OpenAI ecosystem (Codex app/CLI/Copilot integration), or hit edges where OpenAI&#8217;s benchmarks shine. It&#8217;s neck-and-neck with Opus 4.6 right now — many developers test both on the same task.</li>
</ul>
<p>I am currently using the Claude models with my OpenRouter account as I test the NeoStack AI Plugin for Unreal Engine game development.</p>
<p><img loading="lazy" decoding="async" class="alignnone wp-image-12719 size-full" src="https://www.philipconrod.com/wp-content/uploads/2026/02/Openrouter-Betitde-OpenStack-AI-Latest-Model-Testing.png" alt="" width="763" height="505" srcset="https://www.philipconrod.com/wp-content/uploads/2026/02/Openrouter-Betitde-OpenStack-AI-Latest-Model-Testing.png 763w, https://www.philipconrod.com/wp-content/uploads/2026/02/Openrouter-Betitde-OpenStack-AI-Latest-Model-Testing-300x199.png 300w, https://www.philipconrod.com/wp-content/uploads/2026/02/Openrouter-Betitde-OpenStack-AI-Latest-Model-Testing-705x467.png 705w" sizes="auto, (max-width: 763px) 100vw, 763px" /></p>
<p>Bottom line: the AI-assisted video game development race is razor-close! Real preference often comes down to workflow (e.g., Claude Code vs. the Codex app), ecosystem lock-in, and the development tools for your specific game project. In a future post, I will go over using Claude with Unreal Engine and the Betide NeoStack AI Plugin.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Co-Developing Video Games Using the Latest Version of the NeoStack AI Unreal Game Engine Plugin Using Multiple LLMs via OpenRouter</title>
		<link>https://www.philipconrod.com/co-developing-video-games-using-the-latest-version-of-the-neostack-ai-unreal-game-engine-plugin-using-multiple-llms-via-openrouter/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=co-developing-video-games-using-the-latest-version-of-the-neostack-ai-unreal-game-engine-plugin-using-multiple-llms-via-openrouter</link>
		
		<dc:creator><![CDATA[Philip Conrod]]></dc:creator>
		<pubDate>Mon, 09 Feb 2026 01:10:55 +0000</pubDate>
				<category><![CDATA[Frontpage Article]]></category>
		<category><![CDATA[News]]></category>
		<guid isPermaLink="false">https://www.philipconrod.com/?p=12666</guid>

					<description><![CDATA[I am currently testing the NeoStack AI Plugin by BETIDE STUDIO for AI assisted Unreal Engine game development.   The NeoStack AI plugin is currently in Beta, and the original beta was released on the Epic FAB store on January 20, 2026.  I am currently testing the latest beta version (v0.3.1) which was released on 2/9/2026. [&#8230;]]]></description>
										<content:encoded><![CDATA[<p><img loading="lazy" decoding="async" class="alignnone size-large wp-image-12667" src="https://www.philipconrod.com/wp-content/uploads/2026/02/NeoStack-AI-Logo-1030x579.png" alt="" width="1030" height="579" srcset="https://www.philipconrod.com/wp-content/uploads/2026/02/NeoStack-AI-Logo-1030x579.png 1030w, https://www.philipconrod.com/wp-content/uploads/2026/02/NeoStack-AI-Logo-300x169.png 300w, https://www.philipconrod.com/wp-content/uploads/2026/02/NeoStack-AI-Logo-768x431.png 768w, https://www.philipconrod.com/wp-content/uploads/2026/02/NeoStack-AI-Logo-1536x863.png 1536w, https://www.philipconrod.com/wp-content/uploads/2026/02/NeoStack-AI-Logo-1500x843.png 1500w, https://www.philipconrod.com/wp-content/uploads/2026/02/NeoStack-AI-Logo-705x396.png 705w, https://www.philipconrod.com/wp-content/uploads/2026/02/NeoStack-AI-Logo.png 1919w" sizes="auto, (max-width: 1030px) 100vw, 1030px" /></p>
<p>I am currently testing the NeoStack AI Plugin by BETIDE STUDIO for AI-assisted Unreal Engine game development. The plugin is currently in beta; the original beta was released on the Epic FAB store on January 20, 2026, and I am now testing the latest beta version (v0.3.1), which was released on 2/9/2026.</p>
<p><img loading="lazy" decoding="async" class="alignnone wp-image-12686" src="https://www.philipconrod.com/wp-content/uploads/2026/02/NetoStack.0.3.1-Plugin-1030x72.png" alt="" width="1016" height="71" srcset="https://www.philipconrod.com/wp-content/uploads/2026/02/NetoStack.0.3.1-Plugin-1030x72.png 1030w, https://www.philipconrod.com/wp-content/uploads/2026/02/NetoStack.0.3.1-Plugin-300x21.png 300w, https://www.philipconrod.com/wp-content/uploads/2026/02/NetoStack.0.3.1-Plugin-768x53.png 768w, https://www.philipconrod.com/wp-content/uploads/2026/02/NetoStack.0.3.1-Plugin-705x49.png 705w, https://www.philipconrod.com/wp-content/uploads/2026/02/NetoStack.0.3.1-Plugin.png 1209w" sizes="auto, (max-width: 1016px) 100vw, 1016px" /></p>
<p>The developer is releasing updates and fixes at a rapid pace, so I expect to be updating the plugin frequently while it is in beta. I am looking forward to seeing how the Agent Integration plugin evolves over the next several months.</p>
<p>To give you a sense of how rapidly the plugin is evolving, here is a list of updates that were released over the last couple of weeks:</p>
<p>The v0.3 beta update, released just the other day, added the following features and fixes:</p>
<ul>
<li>Composite Graphs, Comments, Macros, Local Variables, Components, Reparenting, and Blueprint Interfaces are now supported! (This means agents can now do 100% of what you can do with Blueprints)</li>
<li>Over 500 additional checks added across the plugin to reduce the number of crashes. If you do crash, please let us know! We want to take the number of crashes to 0</li>
<li>Subconfigs added which means you can configure what tools a profile has</li>
<li>Session Resume has been fixed &#8211; No need to tell the agent what you already have</li>
<li>You can now see Usage Limits for OpenAi Codex/Claude Code in the plugin (See Image)</li>
<li>Codex Reasoning Parameter is now supported</li>
<li>Task Completion Notifications — Toasts, taskbar flash, and sound playback when the agent finishes (Configurable)</li>
<li>Montage Editing made better</li>
<li>Pin Splitting/Recombining &amp; Dynamic Exec Pins is now supported</li>
<li>GitHub Copilot Native CLI support added</li>
<li>Fixed Packaging Issues if using Plugin</li>
<li>Class Defaults editing — ConfigureAssetTool can now set Actor CDO properties like bReplicates, NetUpdateFrequency, AutoPossessPlayer</li>
<li>Last-used agent persistence — Remembers which agent you used last instead of defaulting to Claude Code every time</li>
<li>Fixed issues when the MCP port is already in use</li>
</ul>
<p>The v.0.2 beta update from the week before added the following new features and fixes:</p>
<ul>
<li>Full Animation Montage editing!</li>
<li>Enhanced Input support! Create &amp; edit Enhanced Input assets</li>
<li>Agent can now see what&#8217;s in your viewport! Captures the active viewport as an image with camera info</li>
<li>Level Sequencer has been completely rewritten</li>
<li>Agent can now set any node property directly on Behaviour Trees</li>
<li>BT reads now always show full node tree + blackboard details. Blueprint vars show replication flags. New readers for Enhanced Input &amp; Gameplay Effects. Level Sequence reads are way more detailed now</li>
<li>New &#8220;Think&#8221; toggle for Claude Code — control how much the agent reasons</li>
<li>Streaming no longer bounces — smooth throttled updates</li>
<li>Message layout redesigned — cleaner look</li>
<li>Session saving no longer freezes UI on tab switch</li>
<li>Server now blocks browser requests by default (CSRF protection)</li>
<li>24 Crash Fixes (It is still in Beta!)</li>
</ul>
<p>Prior to releasing the beta plugin, Betide Studio published a standalone Windows &amp; Mac version of the tool called NeoAI.  Because NeoStack AI is a native Unreal Engine plugin, it works across virtually every major Unreal Engine system.  It currently supports the following systems:</p>
<ul>
<li>Blueprints</li>
<li>Materials</li>
<li>Animation Blueprints</li>
<li>Behaviour Trees</li>
<li>State Trees</li>
<li>Structs</li>
<li>Enums</li>
<li>DataTables</li>
<li>Niagara VFX</li>
<li>Level Sequences</li>
<li>IK Rigs</li>
<li>Animation Montages</li>
<li>Enhanced Input</li>
<li>Motion Matching</li>
<li>PCG</li>
<li>MetaSounds</li>
<li>and much more planned over the next several months</li>
</ul>
<p>NeoStack AI is currently priced at $109.99 on the FAB Store.  You can read more about all the different features <a href="https://aik.betide.studio" target="_blank" rel="nofollow noopener">here</a>.</p>
<p>It currently works with the Claude, Gemini, and OpenAI CLIs.  Claude Code &amp; Gemini CLI integration requires those tools to be installed separately.  I personally prefer using my OpenRouter API key so I can pick which AI model to use depending on which development task I am working on.  Betide Studio currently recommends Claude Code, as it works best for creating Blueprint logic while the other models produce less-than-desired results.  The best part of NeoStack AI is that you get to choose which model to use (400+) and how much you want to pay for tokens.  You can even use the free models available on <a href="https://openrouter.ai/collections/free-models" target="_blank" rel="nofollow noopener">OpenRouter</a> (e.g. GLM, DeepSeek, Qwen Coder, etc.) if you don&#8217;t want to pay for tokens and just want to play around with it.</p>
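<p>Under the hood, OpenRouter exposes an OpenAI-compatible chat-completions endpoint, which is what makes this model swapping possible.  Here is a minimal sketch of the kind of request such a tool would send; the model slug shown is a hypothetical example, so check OpenRouter's model list for real slugs:</p>

```python
import json
import os

# Minimal sketch of an OpenRouter chat-completions request, assuming the
# OpenAI-compatible endpoint. Reads the API key from the OPENROUTER_API_KEY
# environment variable; nothing is sent over the network here.
OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"

def build_request(model: str, prompt: str) -> tuple[dict, dict]:
    """Return (headers, payload) for a single-turn coding prompt."""
    headers = {
        "Authorization": f"Bearer {os.environ.get('OPENROUTER_API_KEY', '')}",
        "Content-Type": "application/json",
    }
    payload = {
        "model": model,  # any of the 400+ slugs, including the free-tier ones
        "messages": [{"role": "user", "content": prompt}],
    }
    return headers, payload

headers, payload = build_request(
    "z-ai/glm-4.5-air:free",  # hypothetical slug for illustration
    "Create a C# lunar lander simulation.",
)
print(json.dumps(payload, indent=2))
```

<p>Because the endpoint is OpenAI-compatible, switching models for a different development task is just a change to the <code>model</code> field; the token pricing follows whichever slug you pick.</p>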
<p>The Default Agent LLMs listed in the OpenRouter Dropdown menu are the following:</p>
<p><img loading="lazy" decoding="async" class="alignnone size-full wp-image-12690" src="https://www.philipconrod.com/wp-content/uploads/2026/02/NetoStack-Plugin-Open-Router-LLM-Default-Options.png" alt="" width="250" height="243" srcset="https://www.philipconrod.com/wp-content/uploads/2026/02/NetoStack-Plugin-Open-Router-LLM-Default-Options.png 250w, https://www.philipconrod.com/wp-content/uploads/2026/02/NetoStack-Plugin-Open-Router-LLM-Default-Options-36x36.png 36w" sizes="auto, (max-width: 250px) 100vw, 250px" /></p>
<p>Once you install the plugin and properly configure it, you get access to this Agent Chat Window.  I configured the plugin to connect to OpenRouter with my API Key:</p>
<p><img loading="lazy" decoding="async" class="alignnone size-full wp-image-12687" src="https://www.philipconrod.com/wp-content/uploads/2026/02/NeoStackAgentChatWindow.png" alt="" width="437" height="613" srcset="https://www.philipconrod.com/wp-content/uploads/2026/02/NeoStackAgentChatWindow.png 437w, https://www.philipconrod.com/wp-content/uploads/2026/02/NeoStackAgentChatWindow-214x300.png 214w" sizes="auto, (max-width: 437px) 100vw, 437px" /></p>
<p>I&#8217;ll provide a follow-up on my beta testing process after I have tested the plugin over the next several weeks.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Using the GLM 4.7V Flash Local LLM Model by Z.ai to Develop a Moon Landing Simulation Using C# on my Alienware Aurora R11 RTX-3080 10GB Video Card</title>
		<link>https://www.philipconrod.com/using-the-glm-4-7v-flash-local-llm-model-by-z-ai-to-develop-a-moon-landing-simulation-using-c-on-my-alienware-aurora-r11-rtx-3080-10gb-video-card/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=using-the-glm-4-7v-flash-local-llm-model-by-z-ai-to-develop-a-moon-landing-simulation-using-c-on-my-alienware-aurora-r11-rtx-3080-10gb-video-card</link>
		
		<dc:creator><![CDATA[Philip Conrod]]></dc:creator>
		<pubDate>Thu, 22 Jan 2026 04:40:34 +0000</pubDate>
				<category><![CDATA[Event]]></category>
		<category><![CDATA[Frontpage Article]]></category>
		<category><![CDATA[News]]></category>
		<guid isPermaLink="false">https://www.philipconrod.com/?p=12647</guid>

					<description><![CDATA[The GLM-4.7 Flash model, an Open Weight 30B-parameter Mixture of Experts (MoE) variant, released by Beijing Zhipu Huazhang Technology Co., Ltd on January 19, 2026, is positioned as a lightweight, efficient option for local deployment and agentic tasks like coding. The Lunar Lander simulation coding test ran without any problems at 6.12 tokens per [&#8230;]]]></description>
										<content:encoded><![CDATA[<div draggable="false" data-pm-slice="1 1 []" data-en-clipboard="true">The GLM-4.7 Flash model, an Open Weight 30B-parameter Mixture of Experts (MoE) variant, released by Beijing Zhipu Huazhang Technology Co., Ltd on January 19, 2026, is positioned as a lightweight, efficient option for local deployment and agentic tasks like coding.</div>
<div draggable="false" data-pm-slice="1 1 []" data-en-clipboard="true"></div>
<div draggable="false" data-pm-slice="1 1 []" data-en-clipboard="true"><img loading="lazy" decoding="async" class="alignnone size-full wp-image-12648" src="https://www.philipconrod.com/wp-content/uploads/2026/01/GLM-4.7-Flash-LM-Studio.png" alt="" width="687" height="299" srcset="https://www.philipconrod.com/wp-content/uploads/2026/01/GLM-4.7-Flash-LM-Studio.png 687w, https://www.philipconrod.com/wp-content/uploads/2026/01/GLM-4.7-Flash-LM-Studio-300x131.png 300w" sizes="auto, (max-width: 687px) 100vw, 687px" /></div>
<div draggable="false" data-pm-slice="1 1 []" data-en-clipboard="true"></div>
<p>The Lunar Lander simulation coding test ran without any problems at 6.12 tokens per second using LM Studio:</p>
<p><img loading="lazy" decoding="async" class="alignnone size-full wp-image-12649" src="https://www.philipconrod.com/wp-content/uploads/2026/01/GLM-4.7-Flash-Lunar-Lander-Geforce-RTX-3080.png" alt="" width="822" height="994" srcset="https://www.philipconrod.com/wp-content/uploads/2026/01/GLM-4.7-Flash-Lunar-Lander-Geforce-RTX-3080.png 822w, https://www.philipconrod.com/wp-content/uploads/2026/01/GLM-4.7-Flash-Lunar-Lander-Geforce-RTX-3080-248x300.png 248w, https://www.philipconrod.com/wp-content/uploads/2026/01/GLM-4.7-Flash-Lunar-Lander-Geforce-RTX-3080-768x929.png 768w, https://www.philipconrod.com/wp-content/uploads/2026/01/GLM-4.7-Flash-Lunar-Lander-Geforce-RTX-3080-583x705.png 583w" sizes="auto, (max-width: 822px) 100vw, 822px" /></p>
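<p>For readers unfamiliar with my standard test, the prompt asks the model for a classic lunar-lander program: one-dimensional physics with gravity, engine thrust, and a finite fuel budget, where the goal is to touch down below a safe-velocity threshold.  A minimal sketch of that core physics loop (my own illustration in Python, not the C# code the model generated; the constants are arbitrary):</p>

```python
# Minimal sketch of the lunar-lander physics the coding test asks for.
# Illustrative only; the actual test has the model generate a full C# app.
MOON_GRAVITY = 1.62   # m/s^2, lunar surface gravity
THRUST_ACCEL = 3.0    # m/s^2 while the engine burns
BURN_RATE = 1.0       # fuel units consumed per second of burn
DT = 0.1              # simulation time step, seconds

def step(altitude, velocity, fuel, burning):
    """Advance the lander one time step; velocity is positive downward."""
    accel = MOON_GRAVITY
    if burning and fuel > 0:
        accel -= THRUST_ACCEL
        fuel = max(0.0, fuel - BURN_RATE * DT)
    velocity += accel * DT
    altitude = max(0.0, altitude - velocity * DT)
    return altitude, velocity, fuel

# Naive autopilot: burn whenever we are falling faster than 2 m/s.
alt, vel, fuel = 100.0, 0.0, 100.0
while alt > 0.0:
    alt, vel, fuel = step(alt, vel, fuel, burning=vel > 2.0)
print(f"touchdown velocity: {vel:.2f} m/s "
      f"({'safe' if vel < 3.0 else 'crash'}), fuel left: {fuel:.1f}")
```

<p>The interesting part of the test is that the model has to get the sign conventions, the fuel bookkeeping, and the landing check all consistent in one pass.</p>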
<p>Below you can see the Task Manager performance chart for my NVIDIA GeForce RTX-3080 with 10GB of VRAM:</p>
<p><img loading="lazy" decoding="async" class="alignnone size-large wp-image-12650" src="https://www.philipconrod.com/wp-content/uploads/2026/01/GLM-4.7-Flash-Lunar-Lander-Geforce-RTX-3080-GPU-Performance-1030x673.png" alt="" width="1030" height="673" srcset="https://www.philipconrod.com/wp-content/uploads/2026/01/GLM-4.7-Flash-Lunar-Lander-Geforce-RTX-3080-GPU-Performance-1030x673.png 1030w, https://www.philipconrod.com/wp-content/uploads/2026/01/GLM-4.7-Flash-Lunar-Lander-Geforce-RTX-3080-GPU-Performance-300x196.png 300w, https://www.philipconrod.com/wp-content/uploads/2026/01/GLM-4.7-Flash-Lunar-Lander-Geforce-RTX-3080-GPU-Performance-768x502.png 768w, https://www.philipconrod.com/wp-content/uploads/2026/01/GLM-4.7-Flash-Lunar-Lander-Geforce-RTX-3080-GPU-Performance-705x461.png 705w, https://www.philipconrod.com/wp-content/uploads/2026/01/GLM-4.7-Flash-Lunar-Lander-Geforce-RTX-3080-GPU-Performance.png 1351w" sizes="auto, (max-width: 1030px) 100vw, 1030px" /></p>
<p>Based on available benchmark data, it achieved the following scores on the specified evaluations:</p>
<p>GPQA: 75.2%<br />
AIME 25: 91.6%<br />
SWE-bench Verified: 59.2%</p>
<p>These results position it as a strong performer in its size class, particularly for coding and reasoning, outperforming comparably sized models like Qwen3-30B on SWE-bench Verified while maintaining lower resource requirements. Note that benchmarks can vary slightly based on evaluation configurations, but these figures are consistent across official sources and reviews.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Using GLM 4.6V Flash Local LLM Model by Z.ai to Develop a Moon Landing Simulation Using C# on my Alienware Aurora R11 RTX-3080 10GB Video Card</title>
		<link>https://www.philipconrod.com/using-glm-4-6v-flash-local-llm-model-by-z-ai-to-develop-a-moon-landing-simulation-using-c-on-my-alienware-aurora-r11-rtx-3080-10gb-video-card/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=using-glm-4-6v-flash-local-llm-model-by-z-ai-to-develop-a-moon-landing-simulation-using-c-on-my-alienware-aurora-r11-rtx-3080-10gb-video-card</link>
		
		<dc:creator><![CDATA[Philip Conrod]]></dc:creator>
		<pubDate>Thu, 25 Dec 2025 19:13:21 +0000</pubDate>
				<category><![CDATA[News]]></category>
		<guid isPermaLink="false">https://www.philipconrod.com/?p=12587</guid>

					<description><![CDATA[Over the past 6 months, I have continued to test locally hosted open-source multimodal agentic models which could run comfortably on my Nvidia RTX 3080 with 10GB.  Back in August, I added GLM 4.5 to my testbench as it was surpassing or matching DeepSeek V3, Qwen 2.5 Coder, and Llama 3.1 in benchmarks. At [&#8230;]]]></description>
										<content:encoded><![CDATA[<div draggable="false" data-pm-slice="1 1 []" data-en-clipboard="true">Over the past 6 months, I have continued to test locally hosted open-source multimodal agentic models which can run comfortably on my Nvidia RTX 3080 with 10GB.  Back in August, I added GLM 4.5 to my testbench as it was surpassing or matching DeepSeek V3, Qwen 2.5 Coder, and Llama 3.1 in benchmarks. At the time, I was busy testing OpenAI GPT-OSS, so I didn&#8217;t get a chance to write a blog post about my GLM 4.5 testing.</div>
<div draggable="false" data-pm-slice="1 1 []" data-en-clipboard="true"></div>
<div draggable="false" data-pm-slice="1 1 []" data-en-clipboard="true">Z.ai (Zhipu AI) recently launched GLM 4.6V Flash on LM Studio which is a 9B vision-language model optimized for local deployment and low-latency applications. It supports a context length of 128k tokens and achieves strong performance in visual understanding among models of similar scale.  The model introduces native multimodal function calling, enabling vision-driven tool use where images, screenshots, and document pages can be passed directly as tool inputs without text conversion.</div>
<div draggable="false" data-pm-slice="1 1 []" data-en-clipboard="true"></div>
<div draggable="false" data-pm-slice="1 1 []" data-en-clipboard="true">It needs a &#8220;minimum&#8221; of 8GB of VRAM to run, so I ran my standard Lunar Lander simulation coding test using the LM Studio default 4096-token context window. My initial coding test quickly filled that 4096-token context window and errored out with a &#8220;Failed to send message&#8221; after 1 minute and 44 seconds:</div>
<div draggable="false"></div>
<p><img loading="lazy" decoding="async" class="alignnone size-full wp-image-12589" src="https://www.philipconrod.com/wp-content/uploads/2025/12/GLM-4.6-Flash-Lunar-Lander-Default-4096-Parameter-Context-window.png" alt="" width="663" height="614" srcset="https://www.philipconrod.com/wp-content/uploads/2025/12/GLM-4.6-Flash-Lunar-Lander-Default-4096-Parameter-Context-window.png 663w, https://www.philipconrod.com/wp-content/uploads/2025/12/GLM-4.6-Flash-Lunar-Lander-Default-4096-Parameter-Context-window-300x278.png 300w" sizes="auto, (max-width: 663px) 100vw, 663px" /></p>
<p>I then reloaded the model with a maximum token context window of 131072 with maximum layers offloaded to the GPU:</p>
<p><img loading="lazy" decoding="async" class="alignnone size-full wp-image-12590" src="https://www.philipconrod.com/wp-content/uploads/2025/12/GLM-4.6-Flash-Lunar-Lander-131072-Context-Window.png" alt="" width="822" height="994" srcset="https://www.philipconrod.com/wp-content/uploads/2025/12/GLM-4.6-Flash-Lunar-Lander-131072-Context-Window.png 822w, https://www.philipconrod.com/wp-content/uploads/2025/12/GLM-4.6-Flash-Lunar-Lander-131072-Context-Window-248x300.png 248w, https://www.philipconrod.com/wp-content/uploads/2025/12/GLM-4.6-Flash-Lunar-Lander-131072-Context-Window-768x929.png 768w, https://www.philipconrod.com/wp-content/uploads/2025/12/GLM-4.6-Flash-Lunar-Lander-131072-Context-Window-583x705.png 583w" sizes="auto, (max-width: 822px) 100vw, 822px" /></p>
<p>The Lunar Lander simulation coding test then ran without any problems at 11.69 tokens per second:</p>
<p><img loading="lazy" decoding="async" class="alignnone size-full wp-image-12588" src="https://www.philipconrod.com/wp-content/uploads/2025/12/GLM-4.6-Flash-Lunar-Lander-131072-Max-Parameter.png" alt="" width="822" height="994" srcset="https://www.philipconrod.com/wp-content/uploads/2025/12/GLM-4.6-Flash-Lunar-Lander-131072-Max-Parameter.png 822w, https://www.philipconrod.com/wp-content/uploads/2025/12/GLM-4.6-Flash-Lunar-Lander-131072-Max-Parameter-248x300.png 248w, https://www.philipconrod.com/wp-content/uploads/2025/12/GLM-4.6-Flash-Lunar-Lander-131072-Max-Parameter-768x929.png 768w, https://www.philipconrod.com/wp-content/uploads/2025/12/GLM-4.6-Flash-Lunar-Lander-131072-Max-Parameter-583x705.png 583w" sizes="auto, (max-width: 822px) 100vw, 822px" /></p>
<p>Below you can see the Task Manager performance chart for my NVIDIA GeForce RTX 3080:</p>
<p><img loading="lazy" decoding="async" class="alignnone size-large wp-image-12591" src="https://www.philipconrod.com/wp-content/uploads/2025/12/GLM-4.6-Flash-Lunar-Lander-131072-Context-Window-GPU-Performance-1030x775.png" alt="" width="1030" height="775" srcset="https://www.philipconrod.com/wp-content/uploads/2025/12/GLM-4.6-Flash-Lunar-Lander-131072-Context-Window-GPU-Performance-1030x775.png 1030w, https://www.philipconrod.com/wp-content/uploads/2025/12/GLM-4.6-Flash-Lunar-Lander-131072-Context-Window-GPU-Performance-300x226.png 300w, https://www.philipconrod.com/wp-content/uploads/2025/12/GLM-4.6-Flash-Lunar-Lander-131072-Context-Window-GPU-Performance-768x578.png 768w, https://www.philipconrod.com/wp-content/uploads/2025/12/GLM-4.6-Flash-Lunar-Lander-131072-Context-Window-GPU-Performance-705x531.png 705w, https://www.philipconrod.com/wp-content/uploads/2025/12/GLM-4.6-Flash-Lunar-Lander-131072-Context-Window-GPU-Performance.png 1326w" sizes="auto, (max-width: 1030px) 100vw, 1030px" /></p>
<p>GLM 4.6 scored 81% on GPQA, 93.9% on AIME 2025, and 69% on SWE-bench Verified.  The open-source LLM models keep getting better at coding with each release, and the releases keep coming faster.  Z.ai just released GLM 4.7 on December 22, 2025, and it quickly jumped to the top of the <a href="https://llm-stats.com/leaderboards/open-llm-leaderboard" rel="nofollow">Open-Source Leaderboard</a>.  The new GLM 4.7 scored 85.7% on GPQA, 96.7% on AIME 2025, and 73.8% on SWE-bench Verified. Once the new 4.7 Flash model gets uploaded to LM Studio, I&#8217;ll test it as well.</p>
<p>Wishing you and your family a Merry Christmas and a Happy New Year!</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>OpenAI Releases GPT-5 and it is State of the Art (SOTA) across Key Coding Benchmarks</title>
		<link>https://www.philipconrod.com/openai-releases-gpt-5-and-it-is-state-of-the-art-sota-across-key-coding-benchmarks/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=openai-releases-gpt-5-and-it-is-state-of-the-art-sota-across-key-coding-benchmarks</link>
		
		<dc:creator><![CDATA[Philip Conrod]]></dc:creator>
		<pubDate>Thu, 07 Aug 2025 21:12:17 +0000</pubDate>
				<category><![CDATA[News]]></category>
		<guid isPermaLink="false">https://www.philipconrod.com/?p=12524</guid>

					<description><![CDATA[OpenAI released GPT‑5 today and it is now state-of-the-art (SOTA) across key coding benchmarks, scoring 74.9% on SWE-bench Verified and 88% on Aider polyglot. SWE-bench Verified: (Tests AI models on real-world GitHub issues, evaluating their ability to generate accurate code patches) GPT-5 with Thinking (High) scored highest with 74.9%, followed closely by Claude Opus 4.1 [&#8230;]]]></description>
										<content:encoded><![CDATA[<p>OpenAI released GPT‑5 today and it is now state-of-the-art (SOTA) across key coding benchmarks, scoring 74.9% on <a href="https://llm-stats.com/" target="_blank" rel="nofollow noopener">SWE-bench Verified</a> and 88% on <a href="https://aider.chat/docs/leaderboards/" target="_blank" rel="nofollow noopener">Aider polyglot</a>.</p>
<p dir="auto"><strong>SWE-bench Verified:</strong> (Tests AI models on real-world GitHub issues, evaluating their ability to generate accurate code patches)</p>
<p dir="auto" style="text-align: left;">GPT-5 with Thinking (High) scored highest at 74.9%, followed closely by Claude Opus 4.1 at 74.5%.  For comparison, OpenAI o3 High scored 69.1%.</p>
<p dir="auto"><img loading="lazy" decoding="async" class="alignnone size-large wp-image-12532" src="https://www.philipconrod.com/wp-content/uploads/2025/08/OpenAI-GPT-5-vs-o3-SWE-bench-Verified-Coding-Benchmark-1030x792.png" alt="" width="1030" height="792" srcset="https://www.philipconrod.com/wp-content/uploads/2025/08/OpenAI-GPT-5-vs-o3-SWE-bench-Verified-Coding-Benchmark-1030x792.png 1030w, https://www.philipconrod.com/wp-content/uploads/2025/08/OpenAI-GPT-5-vs-o3-SWE-bench-Verified-Coding-Benchmark-300x231.png 300w, https://www.philipconrod.com/wp-content/uploads/2025/08/OpenAI-GPT-5-vs-o3-SWE-bench-Verified-Coding-Benchmark-768x590.png 768w, https://www.philipconrod.com/wp-content/uploads/2025/08/OpenAI-GPT-5-vs-o3-SWE-bench-Verified-Coding-Benchmark-705x542.png 705w, https://www.philipconrod.com/wp-content/uploads/2025/08/OpenAI-GPT-5-vs-o3-SWE-bench-Verified-Coding-Benchmark.png 1470w" sizes="auto, (max-width: 1030px) 100vw, 1030px" /></p>
<p dir="auto"><strong>Aider Polyglot: </strong>(Evaluates code editing across multiple programming languages, e.g., Java, Rust, and Python):</p>
<p dir="auto">GPT-5 dominates with 88%, a substantial lead over competitors.  Grok 4 comes in next at 79.6%.  For comparison, OpenAI o3 came in at 81%.</p>
<p dir="auto"><img loading="lazy" decoding="async" class="alignnone wp-image-12528 size-full" src="https://www.philipconrod.com/wp-content/uploads/2025/08/OpenAI-GPT-5-vs-o3-Aider-Polyglot-Coding-Benchmark.png" alt="" width="580" height="756" srcset="https://www.philipconrod.com/wp-content/uploads/2025/08/OpenAI-GPT-5-vs-o3-Aider-Polyglot-Coding-Benchmark.png 580w, https://www.philipconrod.com/wp-content/uploads/2025/08/OpenAI-GPT-5-vs-o3-Aider-Polyglot-Coding-Benchmark-230x300.png 230w, https://www.philipconrod.com/wp-content/uploads/2025/08/OpenAI-GPT-5-vs-o3-Aider-Polyglot-Coding-Benchmark-541x705.png 541w" sizes="auto, (max-width: 580px) 100vw, 580px" /></p>
<p dir="auto">This new era of AI-assisted coding LLMs just keeps getting better each day!</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Using OpenAI GPT-OSS Open Weight Local LLM Model to Develop a Moon Landing Simulation Using C# on my Alienware Aurora R11 RTX-3080 10GB Video Card</title>
		<link>https://www.philipconrod.com/using-openai-gpt-oss-open-weight-local-llm-model-to-develop-a-moon-landing-simulation-using-c-on-my-alienware-aurora-r11-rtx-3080-10gb-video-card/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=using-openai-gpt-oss-open-weight-local-llm-model-to-develop-a-moon-landing-simulation-using-c-on-my-alienware-aurora-r11-rtx-3080-10gb-video-card</link>
		
		<dc:creator><![CDATA[Philip Conrod]]></dc:creator>
		<pubDate>Tue, 05 Aug 2025 22:41:39 +0000</pubDate>
				<category><![CDATA[News]]></category>
		<guid isPermaLink="false">https://www.philipconrod.com/?p=12472</guid>

					<description><![CDATA[OpenAI released their new Open Weights local LLM model today under the Apache License that you can download and run locally on your own hardware without a cloud subscription.  The larger 120 billion parameter model will require a system with an 80GB GPU (who has that!?) or a Mac M3/M4 system [&#8230;]]]></description>
										<content:encoded><![CDATA[<p style="text-align: left;">OpenAI released their new Open Weights local LLM models today under the Apache License; you can download and run them locally on your own hardware without a cloud subscription.  The larger 120-billion-parameter model will require a system with an 80GB GPU (who has that!?) or a Mac M3/M4 system with at least 128GB of integrated/shared memory. The new AMD Ryzen AI Max+ 395 with 128GB can also run the 120B-parameter model locally.  You can definitely run the smaller 4-bit-quantized 20-billion-parameter model locally using an NVIDIA GeForce RTX 3090, 40-series, or 50-series video card with at least 16GB of VRAM.  I personally downloaded the smaller 20B-parameter model and got around 11 tokens per second using my RTX 3080 with 10GB in LM Studio.  Some of the GPT-OSS 20B model had to be loaded into system RAM on my Alienware Aurora R11, which made it run much slower.</p>
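<p>A quick back-of-the-envelope calculation shows why these VRAM numbers line up: weight memory is roughly the parameter count times bytes per parameter, and 4-bit quantization is about half a byte per parameter, before counting the KV cache and runtime overhead:</p>

```python
# Rough estimate of model weight memory: parameters x bytes per parameter.
# Ignores the KV cache and runtime overhead, so real usage is somewhat higher.
def weight_gb(params_billions: float, bits_per_param: int) -> float:
    """Approximate weight memory in gigabytes."""
    return params_billions * 1e9 * (bits_per_param / 8) / 1e9

print(weight_gb(20, 4))   # 10.0 -> weights alone fill an RTX 3080's 10GB
print(weight_gb(120, 4))  # 60.0 -> why the 120B model wants an 80GB-class GPU
```

<p>That is why the 20B model spills into system RAM on a 10GB card but fits comfortably on 16GB, and why the 120B model needs datacenter-class or large shared-memory hardware.</p>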
<p style="text-align: left;">On a system with 16GB of VRAM or 128GB of integrated memory, the free GPT-OSS LLM performs pretty well on the Codeforces Competition code benchmark against OpenAI&#8217;s other subscription-based cloud LLMs:</p>
<p style="text-align: left;"><img loading="lazy" decoding="async" class="size-large wp-image-12473 alignnone" src="https://www.philipconrod.com/wp-content/uploads/2025/08/Codeforces-chart-1030x828.png" alt="" width="1030" height="828" srcset="https://www.philipconrod.com/wp-content/uploads/2025/08/Codeforces-chart-1030x828.png 1030w, https://www.philipconrod.com/wp-content/uploads/2025/08/Codeforces-chart-300x241.png 300w, https://www.philipconrod.com/wp-content/uploads/2025/08/Codeforces-chart-768x617.png 768w, https://www.philipconrod.com/wp-content/uploads/2025/08/Codeforces-chart-705x567.png 705w, https://www.philipconrod.com/wp-content/uploads/2025/08/Codeforces-chart.png 1110w" sizes="auto, (max-width: 1030px) 100vw, 1030px" /></p>
<p style="text-align: left;">Source: Codeforces Benchmark from <a href="https://openai.com/index/introducing-gpt-oss/" target="_blank" rel="nofollow noopener">OpenAI</a></p>
<p style="text-align: left;">Since I only have 10GB of VRAM on my RTX-3080, I tested GPT-OSS-20b with <a href="https://lmstudio.ai/blog/gpt-oss" target="_blank" rel="nofollow noopener">LM Studio</a> and offloaded some of the model into my 64GB of system memory.  I first tested it with a simple Hello World C++ application and got an average of 11 tokens per second, dividing the model between high-speed GeForce VRAM and slower system RAM.</p>
<p style="text-align: left;"><img loading="lazy" decoding="async" class="size-full wp-image-12474 alignnone" src="https://www.philipconrod.com/wp-content/uploads/2025/08/LMStudio-on-RTX-3080.png" alt="" width="949" height="576" srcset="https://www.philipconrod.com/wp-content/uploads/2025/08/LMStudio-on-RTX-3080.png 949w, https://www.philipconrod.com/wp-content/uploads/2025/08/LMStudio-on-RTX-3080-300x182.png 300w, https://www.philipconrod.com/wp-content/uploads/2025/08/LMStudio-on-RTX-3080-768x466.png 768w, https://www.philipconrod.com/wp-content/uploads/2025/08/LMStudio-on-RTX-3080-705x428.png 705w" sizes="auto, (max-width: 949px) 100vw, 949px" /></p>
<p style="text-align: left;">I then tested it by building a much larger C# Moon Landing simulation; it ran out of context memory and did not complete the coding task.</p>
<p style="text-align: left;"><img loading="lazy" decoding="async" class="size-full wp-image-12475 alignnone" src="https://www.philipconrod.com/wp-content/uploads/2025/08/LMStudio-on-RTX-3080-GPU-System-Performance.png" alt="" width="950" height="578" srcset="https://www.philipconrod.com/wp-content/uploads/2025/08/LMStudio-on-RTX-3080-GPU-System-Performance.png 950w, https://www.philipconrod.com/wp-content/uploads/2025/08/LMStudio-on-RTX-3080-GPU-System-Performance-300x183.png 300w, https://www.philipconrod.com/wp-content/uploads/2025/08/LMStudio-on-RTX-3080-GPU-System-Performance-768x467.png 768w, https://www.philipconrod.com/wp-content/uploads/2025/08/LMStudio-on-RTX-3080-GPU-System-Performance-705x429.png 705w" sizes="auto, (max-width: 950px) 100vw, 950px" /></p>
<p style="text-align: left;"><img loading="lazy" decoding="async" class="size-full wp-image-12476 alignnone" src="https://www.philipconrod.com/wp-content/uploads/2025/08/PPT-OSS-Moon-Landing-Simulation-LMStudio-on-RTX-3080-GPU-System.png" alt="" width="1015" height="752" srcset="https://www.philipconrod.com/wp-content/uploads/2025/08/PPT-OSS-Moon-Landing-Simulation-LMStudio-on-RTX-3080-GPU-System.png 1015w, https://www.philipconrod.com/wp-content/uploads/2025/08/PPT-OSS-Moon-Landing-Simulation-LMStudio-on-RTX-3080-GPU-System-300x222.png 300w, https://www.philipconrod.com/wp-content/uploads/2025/08/PPT-OSS-Moon-Landing-Simulation-LMStudio-on-RTX-3080-GPU-System-768x569.png 768w, https://www.philipconrod.com/wp-content/uploads/2025/08/PPT-OSS-Moon-Landing-Simulation-LMStudio-on-RTX-3080-GPU-System-705x522.png 705w" sizes="auto, (max-width: 1015px) 100vw, 1015px" /></p>
<p style="text-align: left;">Update: I finally figured out how to enlarge my context window in LM Studio to 131072 and was able to successfully build the Lunar Lander game:</p>
<p><img loading="lazy" decoding="async" class="alignnone size-full wp-image-12662" src="https://www.philipconrod.com/wp-content/uploads/2025/08/GPT-OSS-9B-Flash-Lunar-Lander-Full-Context-Window.png" alt="" width="822" height="994" srcset="https://www.philipconrod.com/wp-content/uploads/2025/08/GPT-OSS-9B-Flash-Lunar-Lander-Full-Context-Window.png 822w, https://www.philipconrod.com/wp-content/uploads/2025/08/GPT-OSS-9B-Flash-Lunar-Lander-Full-Context-Window-248x300.png 248w, https://www.philipconrod.com/wp-content/uploads/2025/08/GPT-OSS-9B-Flash-Lunar-Lander-Full-Context-Window-768x929.png 768w, https://www.philipconrod.com/wp-content/uploads/2025/08/GPT-OSS-9B-Flash-Lunar-Lander-Full-Context-Window-583x705.png 583w" sizes="auto, (max-width: 822px) 100vw, 822px" /></p>
<p style="text-align: left;"><img loading="lazy" decoding="async" class="alignnone size-full wp-image-12661" src="https://www.philipconrod.com/wp-content/uploads/2025/08/GPT-OSS-9B-Flash-Lunar-Lander-131072-Context-Window.png" alt="" width="822" height="994" srcset="https://www.philipconrod.com/wp-content/uploads/2025/08/GPT-OSS-9B-Flash-Lunar-Lander-131072-Context-Window.png 822w, https://www.philipconrod.com/wp-content/uploads/2025/08/GPT-OSS-9B-Flash-Lunar-Lander-131072-Context-Window-248x300.png 248w, https://www.philipconrod.com/wp-content/uploads/2025/08/GPT-OSS-9B-Flash-Lunar-Lander-131072-Context-Window-768x929.png 768w, https://www.philipconrod.com/wp-content/uploads/2025/08/GPT-OSS-9B-Flash-Lunar-Lander-131072-Context-Window-583x705.png 583w" sizes="auto, (max-width: 822px) 100vw, 822px" /></p>
<p style="text-align: left;">I definitely need at least 16GB of VRAM, or better yet, 128GB of shared integrated memory in my development system to get faster local GPT-OSS LLM performance, along with a bigger context window, to complete this Moon Landing C# AI-assisted coding task.</p>
<p style="text-align: left;">Microsoft also <a href="https://azure.microsoft.com/en-us/blog/openais-open%e2%80%91source-model-gpt%e2%80%91oss-on-azure-ai-foundry-and-windows-ai-foundry/" target="_blank" rel="nofollow noopener">announced</a> today that you can also try it out in Foundry Local or the AI Toolkit for VS Code (AITK) and start building your Microsoft applications with the GPT-OSS local LLM models. Let the free local GPT-OSS Open Weights AI-assisted app &amp; game development begin!</p>
]]></content:encoded>
					
		
		
			</item>
	</channel>
</rss>
