It is precise journalistic language. The writer is a reporter. Reporters are supposed to report facts, not express their own opinions.
Was it a terrorist attack? The reporter doesn't know. They know what certain people _said_.
So they don't say people were killed "in a terrorist attack."
Notice the levels of word grouping in this sentence. "State media" is a unit. It means the official government media of the country. "State media and regional authorities" is a large unit. This is the set of people who called it a terrorist attack.
The whole phrase "what state media and regional authorities described as a terrorist attack" is a careful phrase that makes it clear that they are just reporting what someone said. They are not saying it is true.
That phrase is followed by another long one. The reporter mentions the explosions, and then follows them with two long phrases: "in what X described as Y, and which Z blamed on W."
There are two sides to the story. State media and regional authorities said one thing. The government in Tehran said something different. The journalist is trying to write one long sentence that indicates, very quickly, that there are two sides to the story. The journalist is not taking sides, but is reporting what authorities said.