First, let's talk about the 128 gb memory limit for SQL Server Standard Edition. As Erik Darling mentioned in the comments, this Standard Edition limit is for the buffer pool/database cache only. SQL Server has plenty of other memory uses including the column store object pool and used sort/hash query memory, which are both in the stolen memory category and thus not limited by the database cache limit.
Each of the values below will be equal (with the exception of slight timing issues during growth/shrink of the buffer pool) so any of these locations can be checked to see current size of the database cache.
SQL Server doesn't "reality check" the value in place for the Max Server Memory configuration - the UI won't balk at any value unless it's a numeric overflow, there won't be any message in the SQL Server error log indicating the value is one which can't be achieved on the given system for any reason.
If SQL Server decides it can achieve/maintain the Max Server Memory value at any given time - given system RAM and memory use at the time - it will use that value as Target Server Memory (KB). Otherwise it will select a lower, achievable value. SQL Server listens for low memory notifcations from the OS and if a low memory notifcation occurs and SQL Server can release some memory, it will. In order to be a good neighbor.
Total Server Memory (KB) is the sum of database cache, free SQLOS memory, and stolen memory (everything except database cache including plan cache, lock memory, column store, etc).
The goal (my goal anyway) is to keep Total Server Memory = Target Server Memory = Max Server Memory. That's not always possible especially when during batch mode execution aggressive memory allocations contribute to Total Server Memory exceeding Target Server Memory. Anyway...
In Standard Edition, there is currently no access to the Resource Governor. So user queries have a non-configurable maximum memory grant per query of 25%. But... 25% of what?
The "what" is the target memory for the relevant resource semaphore. Every Resource Governor Resource Pool has two resource semaphores for query memory - a regular and a small one. In Standard Edition, with no access to Resource Governor, all user queries are going to go to the same set of resource semaphores (basically the default pool's semaphores). Here since we're talking about large memory grants, we're talking about the regular resource semaphore.
Here's a way to check up on the resource semaphore action. This query was lucky enough to be the only thing running on the system at the time. It's in the default resource pool, and it's so small it goes into the small semaphore instead of the regular semaphore. So this query is the lone grantee, it got a grant of 1136 kb, and at the time of reporting it was using 176 kb. If a query plan had a really, really high memory estimate it could max out the memory grant - in which case the most it could ask for is 25% of the regular semaphore for the pool it is in. (Or in Standard Edition, the only pool available for user queries :-))
The target for regular resource semaphores is not a single fixed value - it is adaptive just as other SQL Server memory targets are. A significant amount of query memory activity in the internal resource pool (something like multiple async auto stats updates for really big tables) will cause the target for the default pool regular semaphore to decrease. If the column store object pool suddenly grows really large at the expense of other use within SQL Server Total Server Memory, the regular semaphore target for resource pools can decrease. In which case, the 25% limit also decreases - not because the numerator changes but because the denominator changes.
You can keep track of these details from perfmon counters, eg the query below this paragraph. One detail to note: Maximum Workspace Memory (KB) is for the default pool only - in Standard Edition it's your only shot for user queries, but it's important to keep that in mind in Enterprise Edition where additional user pools may have been added. And the Maximum Workspace Memory (KB) is the sum of the small and regular semaphore targets. If there's a 25% workspace per query limit, 25% of the Maximum Workspace Memory (KB) counter will be just a smidgeon higher than the actual limit since it has the small semaphore target lumped in.
The size of memory grants can be a constant challenge. My experience is that the difference from actual runtime reality for SQL Server memory estimates is larger than the difference between rowcount estimates and actual rows, by several times. And the ideal memory estimate - the estimate before considering the per query workspace memory max - is almost always significant higher than what is needed (ime, your mileage may vary). That makes sense, if performance of a query running in isolation is the main paradigm. It makes less sense, for example, when someone like me comes along that wants to have the system hit really high cpu utilization doing meaningful work whenever there is work to do Most of the systems I work with couldn't come close to their potential without tuning max memory grant per query with the Resource Governor.
Finally, let's talk about large memory grants. A really large memory grant is not necessarily a sign of a poorly written query or a need to change the schema, update stats, etc. Sometimes it just is - like other limitations of cardinality estimation or query optimization.
Exhibit A: dbcc checkdb, with logical checks (ie without the physical_only option). As tables grow, memory estimates, ideal memory, and grant requests will grow. Till they hit the limit. And there ain't much to do about it, other than tune it down with Resource Governor.
Exhibit B: bulk insert into clustered columnstore indexes. SQL Server uses a range of memory estimates based on a predicted range of rowgroup quality, trying to balance resources used, performance and end result. Sometimes even the lower bound of that range for CCI bulk insert is quite a bit more than the query uses. Not much to do about that. In fact, try to "right size" the max grant too much, lowering it too close to what is actually used by the insert query, might just end up lowering the DOP from requested DOP, or worse yet... making SQL Server abandon bulk insert and instead go for trickle insert and use of delta stores.
So... there ya go. Check Target Server Memory (KB) and Total Server Memory (KB) in your Standard Edition SQL Server instance. They aren't directly bound by the 128 gb limit. You can check the database cache size to verify it is bound by the 128 gb limit. And you can check the regular resource semaphore target to see what the 25% limit is calculated against.