GPT4 With Reflexion Has a Superior Coding Score
by Brian Wang from NextBigFuture.com (#6AK8A)
A slightly improved Reflexion-based GPT-4 agent achieves a state-of-the-art pass@1 result (88%) on HumanEval, outperforming GPT-4 (67.0%) and CodeT: Code Generation with Generated Tests (65.8%), which held the previous state-of-the-art results.

Relaxing Success Evaluation

By using Reflexion to iteratively refine the current implementation, researchers are shifting the "accuracy bottleneck" from correct syntactic and semantic code generation ...
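
To make the idea of iterative refinement concrete, here is a minimal sketch of a Reflexion-style loop in Python. It is an illustration under stated assumptions, not the researchers' implementation: `generate` and `reflect` are hypothetical stand-ins for calls to a model such as GPT-4, `run_tests` and `reflexion_loop` are names invented for this example, and the toy functions at the bottom simply simulate a model that fixes its bug after one round of feedback.

```python
from typing import Callable, List, Tuple


def run_tests(code: str, tests: List[str]) -> List[str]:
    """Run the candidate code against each test; return failure messages."""
    failures = []
    for test in tests:
        namespace: dict = {}
        try:
            exec(code, namespace)  # define the candidate function
            exec(test, namespace)  # run one assertion against it
        except Exception as exc:
            failures.append(f"{test!r} failed: {type(exc).__name__}: {exc}")
    return failures


def reflexion_loop(
    generate: Callable[[List[str]], str],      # hypothetical model call
    reflect: Callable[[str, List[str]], str],  # hypothetical model call
    tests: List[str],
    max_iters: int = 3,
) -> Tuple[str, bool]:
    """Generate code, test it, and retry with accumulated reflections."""
    memory: List[str] = []  # verbal feedback carried across attempts
    code = generate(memory)
    for _ in range(max_iters):
        failures = run_tests(code, tests)
        if not failures:
            return code, True
        memory.append(reflect(code, failures))
        code = generate(memory)
    return code, not run_tests(code, tests)


# Toy stand-ins (assumptions): the first attempt is buggy, and any
# reflection in memory leads to the corrected version.
def toy_generate(memory: List[str]) -> str:
    if not memory:
        return "def add(a, b):\n    return a - b\n"
    return "def add(a, b):\n    return a + b\n"


def toy_reflect(code: str, failures: List[str]) -> str:
    return f"{len(failures)} test(s) failed; check the arithmetic operator."


toy_tests = ["assert add(2, 3) == 5", "assert add(0, 0) == 0"]
final_code, passed = reflexion_loop(toy_generate, toy_reflect, toy_tests)
```

In this sketch the loop only stops early when every test passes, so the quality of the result depends on the quality of the tests it is given. Running model-written code with `exec` is acceptable for a toy example like this one, but a real system would need a sandbox.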