
The answer given by pyke is incorrect even though the formulation is correct. #2

Open
Shujie24 opened this issue Aug 2, 2023 · 0 comments

Shujie24 commented Aug 2, 2023

This is similar to another issue: #1.
I ran into it while solving ProntoQA with pyke. A typical example is problem ProntoQA_10:

{
    "id": "ProntoQA_10",
    "context": "Every impus is earthy. Each impus is a jompus. Jompuses are small. Jompuses are rompuses. Rompuses are not amenable. Rompuses are wumpuses. Wumpuses are wooden. Wumpuses are zumpuses. Every zumpus is temperate. Every zumpus is a dumpus. Dumpuses are dull. Dumpuses are vumpuses. Every vumpus is not shy. Every yumpus is sweet. Vumpuses are numpuses. Numpuses are not sweet. Numpuses are tumpuses. Fae is a wumpus.",
    "question": "Is the following statement true or false? Fae is sweet.",
    "options": [
      "A) True",
      "B) False"
    ],
    "answer": "B",
    "explanation": [
      "Fae is a wumpus.",
      "Wumpuses are zumpuses.",
      "Fae is a zumpus.",
      "Every zumpus is a dumpus.",
      "Fae is a dumpus.",
      "Dumpuses are vumpuses.",
      "Fae is a vumpus.",
      "Vumpuses are numpuses.",
      "Fae is a numpus.",
      "Numpuses are not sweet.",
      "Fae is not sweet."
    ]
  }

The translation from natural language into a pyke program is also correct:

fact1
	foreach
		facts.Impus($x, True)
	assert
		facts.Earthy($x, True)

fact2
	foreach
		facts.Impus($x, True)
	assert
		facts.Jompus($x, True)

fact3
	foreach
		facts.Jompus($x, True)
	assert
		facts.Small($x, True)

fact4
	foreach
		facts.Jompus($x, True)
	assert
		facts.Rompus($x, True)

fact5
	foreach
		facts.Rompus($x, True)
	assert
		facts.Amenable($x, False)

fact6
	foreach
		facts.Rompus($x, True)
	assert
		facts.Wumpus($x, True)

fact7
	foreach
		facts.Wumpus($x, True)
	assert
		facts.Wooden($x, True)

fact8
	foreach
		facts.Wumpus($x, True)
	assert
		facts.Zumpus($x, True)

fact9
	foreach
		facts.Zumpus($x, True)
	assert
		facts.Temperate($x, True)

fact10
	foreach
		facts.Zumpus($x, True)
	assert
		facts.Dumpus($x, True)

fact11
	foreach
		facts.Dumpus($x, True)
	assert
		facts.Dull($x, True)

fact12
	foreach
		facts.Dumpus($x, True)
	assert
		facts.Vumpus($x, True)

fact13
	foreach
		facts.Vumpus($x, True)
	assert
		facts.Shy($x, False)

fact14
	foreach
		facts.Yumpus($x, True)
	assert
		facts.Sweet($x, True)

fact15
	foreach
		facts.Vumpus($x, True)
	assert
		facts.Numpus($x, True)

fact16
	foreach
		facts.Numpus($x, True)
	assert
		facts.Sweet($x, False)

fact17
	foreach
		facts.Numpus($x, True)
	assert
		facts.Tumpus($x, True)

However, when these rules are given to pyke, the predicted answer is A rather than the correct answer B.
Is there a problem with pyke, or is pyke's answer somehow correct?
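
As a sanity check on the expected answer, the rules above can be replayed with a small forward-chaining loop in plain Python. This is only a sketch and does not use pyke at all: the tuple encoding of rules and facts, the `forward_chain` helper, and the answer mapping are assumptions made for illustration, and the starting fact `Wumpus(Fae, True)` is taken from the context sentence "Fae is a wumpus."

```python
# Hypothetical encoding (not pyke syntax): each rule is
# (premise predicate, conclusion predicate, conclusion value).
# Every premise requires the predicate to hold with value True,
# mirroring fact1..fact17 above.
RULES = [
    ("Impus", "Earthy", True),
    ("Impus", "Jompus", True),
    ("Jompus", "Small", True),
    ("Jompus", "Rompus", True),
    ("Rompus", "Amenable", False),
    ("Rompus", "Wumpus", True),
    ("Wumpus", "Wooden", True),
    ("Wumpus", "Zumpus", True),
    ("Zumpus", "Temperate", True),
    ("Zumpus", "Dumpus", True),
    ("Dumpus", "Dull", True),
    ("Dumpus", "Vumpus", True),
    ("Vumpus", "Shy", False),
    ("Yumpus", "Sweet", True),
    ("Vumpus", "Numpus", True),
    ("Numpus", "Sweet", False),
    ("Numpus", "Tumpus", True),
]


def forward_chain(initial_facts, rules):
    """Apply the rules repeatedly until no new fact can be derived."""
    facts = set(initial_facts)
    changed = True
    while changed:
        changed = False
        for premise, conclusion, value in rules:
            for pred, entity, val in list(facts):
                if pred == premise and val is True:
                    new_fact = (conclusion, entity, value)
                    if new_fact not in facts:
                        facts.add(new_fact)
                        changed = True
    return facts


# Starting fact from the context: "Fae is a wumpus."
derived = forward_chain({("Wumpus", "Fae", True)}, RULES)

# Collect every truth value derived for Sweet(Fae, ...).
sweet_values = {val for pred, entity, val in derived
                if pred == "Sweet" and entity == "Fae"}

# Map the derived value onto the options: A) True, B) False.
if sweet_values == {True}:
    answer = "A"
elif sweet_values == {False}:
    answer = "B"
else:
    answer = "Unknown"

print(sweet_values, answer)
```

Under this encoding the only value derived for `Sweet(Fae, ...)` is `False`, which corresponds to option B. The rule concluding `Sweet($x, True)` never fires, because `Yumpus(Fae, True)` is not derivable from the starting fact.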
