For part a, you made two errors. First, the question specifically stated "page table entries are rounded up to 4 bytes". Second, a PTE only needs enough bits to identify a page-aligned physical address. In the described system, physical addresses are only 30 bits (1 GiB). Since that system uses 4 KiB pages, the least significant 12 bits of any physical page address are all zeros and so can be left implicit in the PTE. So the physical address alone needs only 18 bits (30 - 12).
Aside from the desirability of rounding to a power of two number of bytes, most PTEs include additional data such as a valid bit, a modified bit, an accessed bit, and permission bits for user and supervisor modes; so even with a 512 MiB physical address space and 8 KiB pages (16 bits needed to indicate the physical address), one could not use 2 byte PTEs.
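As a rough check of the arithmetic for both configurations mentioned above, here is a small Python sketch. The function names and the count of five flag bits (valid, modified, accessed, and two permission bits) are assumptions chosen for illustration; real PTE formats vary.

```python
KIB, MIB, GIB = 2**10, 2**20, 2**30

def frame_number_bits(phys_mem_bytes, page_bytes):
    """Bits needed to name a page-aligned physical frame (sizes are powers of two)."""
    phys_addr_bits = phys_mem_bytes.bit_length() - 1
    page_offset_bits = page_bytes.bit_length() - 1
    return phys_addr_bits - page_offset_bits

def pte_size_bytes(frame_bits, flag_bits):
    """Smallest power-of-two byte count that holds the frame number plus flags."""
    total_bits = frame_bits + flag_bits
    size = 1
    while size * 8 < total_bits:
        size *= 2
    return size

# Assumed flag count for illustration only.
FLAG_BITS = 5

for phys, page in ((1 * GIB, 4 * KIB), (512 * MIB, 8 * KIB)):
    frame_bits = frame_number_bits(phys, page)
    print(frame_bits, pte_size_bytes(frame_bits, FLAG_BITS))
# prints "18 4" then "16 4"
```

Under that assumption, even the 16-bit frame number ends up in a 4-byte PTE once the flag bits are added.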
(It should be noted that no 32-bit processor would use a flat page table. For 32-bit addresses, hierarchical or linear page tables are generally used. These introduce a little extra space overhead for full occupancy and can require multiple memory accesses to find a translation, but in the common case of partial occupancy and dense allocation they use substantially less memory. This is particularly significant since most processors are designed for multiple-address-space OSes, where each process has its own page table. Using roughly 40% of the 1 GiB of physical memory [400 MiB] for page tables to support just 100 processes is understandably unattractive.)
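The 400 MiB figure follows from one 4-byte PTE per 4 KiB page of a 32-bit virtual address space. A minimal sketch of that calculation (the function name is hypothetical):

```python
MIB, GIB = 2**20, 2**30

def flat_page_table_bytes(virt_addr_bits, page_bytes, pte_bytes):
    """Size of a flat (single-level) page table covering the whole virtual space."""
    entries = 2**virt_addr_bits // page_bytes
    return entries * pte_bytes

per_process = flat_page_table_bytes(32, 4096, 4)
total = 100 * per_process

print(per_process // MIB)  # 4 (MiB per process)
print(total // MIB)        # 400 (MiB for 100 processes)
print(total / GIB)         # 0.390625 of a 1 GiB physical memory
```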
For part b, you are correct that being 4-way set associative means that there are 4 blocks in each set and so 2 bits are subtracted from the number of bits needed for indexing based on the number of entries. However, log2(256) is 8 not 10, so only 6 bits are used for indexing the TLB.
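The index width can be checked with a couple of lines of Python; the function name is just for illustration, and the entry count and associativity are assumed to be powers of two.

```python
def tlb_index_bits(entries, ways):
    """Index bits for a set-associative TLB: log2(number of sets)."""
    sets = entries // ways
    return sets.bit_length() - 1  # exact log2 for a power of two

print(tlb_index_bits(256, 4))  # 6 (256 entries / 4 ways = 64 sets)
```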
In a data cache, the tag size would equal the number of address bits minus the number of index bits, minus the number of offset bits (within the cache block).
For a TLB, the virtual address is aligned to the size of the page (the least significant bits within the page are untranslated). For 4 KiB pages, this means that the 12 least significant bits are ignored. With a 32-bit virtual address, this leaves 20 bits.
Since 6 of these bits are used for indexing (as determined above), 14 bits are left for the tag.
For a non-clustered TLB, each tag is associated with one translation. This would be equivalent to a data cache block size of 1 byte (i.e., 0 offset bits). Therefore, the tag (excluding any Address Space ID) would be 14 bits.
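To make the field breakdown concrete, the following sketch splits a 32-bit virtual address into tag, index, and page offset. It assumes the index is taken from the low bits of the virtual page number, which is the conventional choice but is not stated in the problem; the names are illustrative.

```python
VA_BITS = 32
PAGE_OFFSET_BITS = 12  # 4 KiB pages
INDEX_BITS = 6         # 64 sets
TAG_BITS = VA_BITS - PAGE_OFFSET_BITS - INDEX_BITS  # 14

def split_virtual_address(va):
    """Return (tag, index, offset) for a 32-bit virtual address."""
    offset = va & ((1 << PAGE_OFFSET_BITS) - 1)  # untranslated bits within the page
    vpn = va >> PAGE_OFFSET_BITS                 # 20-bit virtual page number
    index = vpn & ((1 << INDEX_BITS) - 1)        # assumed: low VPN bits select the set
    tag = vpn >> INDEX_BITS                      # remaining 14 bits are compared
    return tag, index, offset

tag, index, offset = split_virtual_address(0xDEADBEEF)
print(TAG_BITS, hex(tag), hex(index), hex(offset))  # 14 0x37ab 0x1b 0xeef
```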
(In a clustered TLB [analogous to sectored cache blocks], two or more translations are provided for each "entry"--entry becomes a less clear term as it could refer to the translation entry or to the combination of the tag and multiple translations associated with that tag. [I suspect you appreciate such complexities not being part of these problems.])